Boost application performance using asynchronous I/O
Learn when and how to use the POSIX AIO API
M. Tim Jones, Consultant Engineer, Emulex
Summary: The most common input/output (I/O) model used in Linux
is synchronous I/O. After a request
is made in this model, the application blocks until the request
is satisfied. This is a great paradigm because the
calling application requires no central processing unit (CPU)
while it awaits the completion of the I/O request.
But in some cases there's a need to overlap an I/O request with
other processing. The Portable Operating
System Interface (POSIX) asynchronous I/O (AIO) application
program interface (API) provides this
capability. In this article, get an overview of the API and see
how to use it.
Date: 29 Aug 2006
Level: Intermediate
Introduction to AIO
Linux asynchronous I/O is a relatively recent addition to the
Linux kernel. It's a standard feature of the 2.6
kernel, but you can find patches for 2.4. The basic idea behind
AIO is to allow a process to initiate a number
of I/O operations without having to block or wait for any to
complete. At some later time, or after being
notified of I/O completion, the process can retrieve the results
of the I/O.
I/O models
Before digging into the AIO API, let's explore the different I/O
models that are available under Linux. This
isn't intended as an exhaustive review, but rather aims to cover
the most common models to illustrate their
differences from asynchronous I/O. Figure 1 shows synchronous
and asynchronous models, as well as
blocking and non-blocking models.
Figure 1. Simplified matrix of basic Linux I/O models
Boost application performance using asynchronous I/O
http://www.ibm.com/developerworks/linux/library/l-async/index.html
1 of 14 7/15/2010 3:34 PM
Each of these I/O models has usage patterns that are
advantageous for particular applications. This section
briefly explores each one.
I/O-bound versus CPU-bound processes
A process that is I/O bound is one that performs more I/O than
processing. A CPU-bound process does more
processing than I/O. The Linux 2.6 scheduler actually favors
I/O-bound processes because they commonly
initiate an I/O and then block, which means other work can be
efficiently interlaced between them.
Synchronous blocking I/O
One of the most common models is the synchronous blocking I/O
model. In this model, the user-space
application performs a system call that results in the
application blocking. This means that the application
blocks until the system call is complete (data transferred or
error). The calling application is in a state where it
consumes no CPU and simply awaits the response, so it is
efficient from a processing perspective.
Figure 2 illustrates the traditional blocking I/O model, which
is also the most common model used in
applications today. Its behaviors are well understood, and its
usage is efficient for typical applications. When
the read system call is invoked, the application blocks and the
context switches to the kernel. The read is
then initiated, and when the response returns (from the device
from which you're reading), the data is moved
to the user-space buffer. Then the application is unblocked (and
the read call returns).
Figure 2. Typical flow of the synchronous blocking I/O model
From the application's perspective, the read call spans a long
duration. In fact, the application is blocked for that entire
time while the kernel multiplexes the read with other work.
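The blocking model can be sketched in a few lines of C. This is a minimal illustration, assuming a descriptor that is already open; the helper name read_fully is hypothetical and is not part of any library.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `len` bytes from `fd`, blocking until the data arrives,
 * end-of-file is reached, or an error occurs.
 * Returns the number of bytes read, or -1 with errno set. */
ssize_t read_fully(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        /* The process sleeps inside read() until the kernel has data. */
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;               /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

While the caller is inside read_fully it consumes no CPU, but it also cannot do any other work.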
Synchronous non-blocking I/O
A less efficient variant of synchronous blocking is synchronous
non-blocking I/O. In this model, a device is
opened as non-blocking. This means that instead of completing an
I/O immediately, a read may return an
error code indicating that the command could not be immediately
satisfied (EAGAIN or EWOULDBLOCK), as
shown in Figure 3.
Figure 3. Typical flow of the synchronous non-blocking I/O
model
The implication of non-blocking is that an I/O command may not
be satisfied immediately, requiring that the
application make numerous calls to await completion. This can be
extremely inefficient because in many
cases the application must busy-wait until the data is available
or attempt to do other work while the
command is performed in the kernel. As also shown in Figure 3,
this method can introduce latency in the I/O
because any gap between the data becoming available in the
kernel and the user calling read to return it can
reduce the overall data throughput.
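As a rough sketch of this model, the following C fragment switches a descriptor to non-blocking mode and makes a single read attempt. The helper names set_nonblocking and try_read, and the -2 "try again" return value, are conventions invented for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch an open descriptor to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Make one read attempt. Returns the bytes read (0 means end of file),
 * -1 on a real error, or -2 if no data is ready and the caller must
 * try again later. */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

The caller has to keep calling try_read until it stops returning -2, which is exactly the polling overhead described above.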
Asynchronous blocking I/O
Another blocking paradigm is non-blocking I/O with blocking
notifications. In this model, non-blocking I/O is
configured, and then the blocking select system call is used to
determine when there's any activity for an I/O
descriptor. What makes the select call interesting is that it
can be used to provide notification for not just
one descriptor, but many. For each descriptor, you can request
notification of the descriptor's ability to write
data, availability of read data, and also whether an error has
occurred.
Figure 4. Typical flow of the asynchronous blocking I/O model
(select)
The primary issue with the select call is that it's not very
efficient. While it's a convenient model for
asynchronous notification, its use for high-performance I/O is
not advised.
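A small sketch of blocking on notification with select follows. It watches a single descriptor for readability; the helper name wait_readable and the millisecond timeout parameter are assumptions made for illustration.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Block in select() until `fd` is readable or `timeout_ms` elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno set). */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int ready;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The first argument is the highest descriptor number plus one. */
    ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

More descriptors can be added to the same set with further FD_SET calls, which is what makes select useful for watching many channels at once.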
Asynchronous non-blocking I/O (AIO)
Finally, the asynchronous non-blocking I/O model is one of
overlapping processing with I/O. The read request
returns immediately, indicating that the read was successfully
initiated. The application can then perform
other processing while the background read operation completes.
When the read response arrives, a signal or
a thread-based callback can be generated to complete the I/O
transaction.
Figure 5. Typical flow of the asynchronous non-blocking I/O
model
The ability to overlap computation and I/O processing in a
single process for potentially multiple I/O requests
exploits the gap between processing speed and I/O speed. While
one or more slow I/O requests are pending,
the CPU can perform other tasks or, more commonly, operate on
already completed I/Os while other I/Os are
initiated.
The next section examines this model further, explores the API,
and then demonstrates a number of the
commands.
Motivation for asynchronous I/O
From the previous taxonomy of I/O models, you can see the
motivation for AIO. The blocking models require
the initiating application to block when the I/O has started.
This means that it isn't possible to overlap
processing and I/O at the same time. The synchronous
non-blocking model allows overlap of processing and
I/O, but it requires that the application check the status of
the I/O on a recurring basis. This leaves
asynchronous non-blocking I/O, which permits overlap of
processing and I/O, including notification of I/O
completion.
The functionality provided by the select function (asynchronous
blocking I/O) is similar to AIO, except that
it still blocks. However, it blocks on notifications instead of
the I/O call.
Introduction to AIO for Linux
This section explores the asynchronous I/O model for Linux to
help you understand how to apply it in your
applications.
In a traditional I/O model, there is an I/O channel that is
identified by a unique handle. In UNIX, these are
file descriptors (which are the same for files, pipes, sockets,
and so on). In blocking I/O, you initiate a transfer
and the system call returns when it's complete or an error has
occurred.
AIO for Linux
AIO first entered the Linux kernel in 2.5 and is now a standard
feature of 2.6 production kernels.
In asynchronous non-blocking I/O, you have the ability to
initiate multiple transfers at the same time. This
requires a unique context for each transfer so you can identify
it when it completes. In AIO, this is an aiocb
(AIO I/O Control Block) structure. This structure contains all
of the information about a transfer, including a
user buffer for data. When notification for an I/O occurs
(called a completion), the aiocb structure is
provided to uniquely identify the completed I/O. The API
demonstration shows how to do this.
AIO API
The AIO interface API is quite simple, but it provides the
necessary functions for data transfer with a couple
of different notification models. Table 1 shows the AIO
interface functions, which are further explained later
in this section.
Table 1. AIO interface APIs
API function   Description
aio_read       Request an asynchronous read operation
aio_error      Check the status of an asynchronous request
aio_return     Get the return status of a completed asynchronous request
aio_write      Request an asynchronous write operation
aio_suspend    Suspend the calling process until one or more
               asynchronous requests have completed (or failed)
aio_cancel     Cancel an asynchronous I/O request
lio_listio     Initiate a list of I/O operations
Each of these API functions uses the aiocb structure for
initiating or checking. This structure has a number
of elements, but Listing 1 shows only the ones that you'll need
to (or can) use.
Listing 1. The aiocb structure showing the relevant fields
struct aiocb {

  int aio_fildes;               // File Descriptor
  off_t aio_offset;             // File Offset for the Transfer
  int aio_lio_opcode;           // Valid only for lio_listio (r/w/nop)
  volatile void *aio_buf;       // Data Buffer
  size_t aio_nbytes;            // Number of Bytes in Data Buffer
  struct sigevent aio_sigevent; // Notification Structure

  /* Internal fields */
  ...

};
The sigevent structure tells AIO what to do when the I/O
completes. You'll explore this structure in the AIO
demonstration. Now I'll show you how the individual API
functions for AIO work and how you can use them.
aio_read
The aio_read function requests an asynchronous read operation
for a valid file descriptor. The file descriptor
can represent a file, a socket, or even a pipe. The aio_read
function has the following prototype:
int aio_read( struct aiocb *aiocbp );
The aio_read function returns immediately after the request has
been queued. The return value is zero on
success or -1 on error, with errno set to indicate the error.
To perform a read, the application must initialize the aiocb
structure. The following short example illustrates
filling in the aiocb request structure and using aio_read to
perform an asynchronous read request (ignore
notification for now). It also shows use of the aio_error
function, but I'll explain that later.
Listing 2. Sample code for an asynchronous read with
aio_read
#include <aio.h>
...

  int fd, ret;
  struct aiocb my_aiocb;

  fd = open( "file.txt", O_RDONLY );
  if (fd < 0) perror("open");

  /* Zero out the aiocb structure (recommended) */
  bzero( (char *)&my_aiocb, sizeof(struct aiocb) );

  /* Allocate a data buffer for the aiocb request */
  my_aiocb.aio_buf = malloc(BUFSIZE+1);
  if (!my_aiocb.aio_buf) perror("malloc");

  /* Initialize the necessary fields in the aiocb */
  my_aiocb.aio_fildes = fd;
  my_aiocb.aio_nbytes = BUFSIZE;
  my_aiocb.aio_offset = 0;

  ret = aio_read( &my_aiocb );
  if (ret < 0) perror("aio_read");

  /* Busy-wait until the request is no longer in progress */
  while ( aio_error( &my_aiocb ) == EINPROGRESS ) ;

  if ((ret = aio_return( &my_aiocb )) > 0) {
    /* got ret bytes on the read */
  } else {
    /* read failed, consult errno */
  }
In Listing 2, after the file from which you're reading data is
opened, you zero out your aiocb structure, and
then allocate a data buffer. The reference to the data buffer is
placed into aio_buf. Subsequently, you
initialize the size of the buffer into aio_nbytes. The
aio_offset is set to zero (the first offset in the file).
You set the file descriptor from which you're reading into
aio_fildes. After these fields are set, you call
aio_read to request the read. You can then make a call to
aio_error to determine the status of the
aio_read. As long as the status is EINPROGRESS, you busy-wait
until the status changes. At this point, your
request has either succeeded or failed.
Building with the AIO interface
You can find the function prototypes and other necessary
symbolics in the aio.h header file. When building
an application that uses this interface, you must use the POSIX
real-time extensions library (librt).
Note the similarities to reading from the file with the standard
library functions. In addition to the
asynchronous nature of aio_read, another difference is setting
the offset for the read. In a typical read call,
the offset is maintained for you in the file descriptor context.
For each read, the offset is updated so that
subsequent reads address the next block of data. This isn't
possible with asynchronous I/O because you can
perform many read requests simultaneously, so you must specify
the offset for each particular read request.
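To make the role of the per-request offset concrete, here is a hedged sketch that starts two reads on the same descriptor at different offsets and then collects both results. The function name read_two_regions and its parameter layout are invented for this example; only the aio_* calls and aiocb fields come from the AIO interface.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Read `len` bytes at `off_a` into `buf_a` and `len` bytes at `off_b`
 * into `buf_b`, with both requests in flight at the same time.
 * got[i] receives each request's return status. Returns 0 or -1. */
int read_two_regions(int fd, void *buf_a, off_t off_a,
                     void *buf_b, off_t off_b, size_t len, ssize_t got[2])
{
    struct aiocb cb[2];
    void *bufs[2] = { buf_a, buf_b };
    off_t offs[2] = { off_a, off_b };
    int started = 0, status = 0;

    memset(cb, 0, sizeof(cb));
    for (int i = 0; i < 2; i++) {
        cb[i].aio_fildes = fd;
        cb[i].aio_buf = bufs[i];
        cb[i].aio_nbytes = len;
        cb[i].aio_offset = offs[i];   /* each request carries its own offset */
        if (aio_read(&cb[i]) < 0) {
            status = -1;
            break;
        }
        started++;
    }

    /* Wait for every request that was started and collect its result. */
    for (int i = 0; i < started; i++) {
        const struct aiocb *const one[1] = { &cb[i] };
        int err;

        while ((err = aio_error(&cb[i])) == EINPROGRESS)
            aio_suspend(one, 1, NULL);    /* sleep instead of spinning */
        got[i] = aio_return(&cb[i]);
        if (err != 0)
            status = -1;
    }
    return status;
}
```

Because neither request touches the descriptor's own file position, the order in which the two reads finish does not matter.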
aio_error
The aio_error function is used to determine the status of a
request. Its prototype is:
int aio_error( struct aiocb *aiocbp );
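This function can return the following:
0, indicating the request completed successfully
EINPROGRESS, indicating the request has not yet completed
ECANCELED, indicating the request was canceled by the
application
A positive error number, indicating that the request failed (the
same value errno would hold after a failed synchronous call)
-1, indicating that the aio_error call itself failed, in which
case you can consult errno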
aio_return
Another difference between asynchronous I/O and standard
blocking I/O is that you don't have immediate
access to the return status of your function because you're not
blocking on the read call. In a standard read
call, the return status is provided upon return of the function.
With asynchronous I/O, you use the
aio_return function. This function has the following
prototype:
ssize_t aio_return( struct aiocb *aiocbp );
This function is called only after the aio_error call has
determined that your request has completed (either
successfully or in error). The return value of aio_return is
identical to that of the read or write system call
in a synchronous context (number of bytes transferred or -1 for
error).
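The pairing of aio_error and aio_return can be wrapped in a small polling helper. The sketch below is one possible arrangement, not a standard API: the enum req_state and the function poll_request are hypothetical names introduced here.

```c
#include <aio.h>
#include <errno.h>
#include <sys/types.h>

enum req_state { REQ_PENDING, REQ_DONE, REQ_FAILED };

/* Check one request without blocking.
 * REQ_PENDING: still running, call again later.
 * REQ_DONE:    *result holds the byte count from aio_return().
 * REQ_FAILED:  *error holds the error number reported for the request. */
enum req_state poll_request(struct aiocb *cb, ssize_t *result, int *error)
{
    int err = aio_error(cb);

    if (err == EINPROGRESS)
        return REQ_PENDING;
    if (err == -1) {              /* aio_error() itself failed */
        *error = errno;
        return REQ_FAILED;
    }

    /* The request has finished, so aio_return() may now be called.
     * It should be called exactly once per request. */
    *result = aio_return(cb);
    if (err != 0) {
        *error = err;
        return REQ_FAILED;
    }
    return REQ_DONE;
}
```

A caller would typically invoke poll_request between units of other work and stop once it no longer returns REQ_PENDING.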
aio_write
The aio_write function is used to request an asynchronous write.
Its function prototype is:
int aio_write( struct aiocb *aiocbp );
The aio_write function returns immediately, indicating that the
request has been enqueued (with a return of
0 on success and -1 on failure, with errno properly set).
This is similar to aio_read, but one behavioral
difference is worth noting. Recall that the offset to be
used is important with the read call. However, with write, the
offset is important only if used in a file
context where the O_APPEND option is not set. If O_APPEND is
set, then the offset is ignored and the data is
appended to the end of the file. Otherwise, the aio_offset field
determines the offset at which the data is
written to the file.
aio_suspend
You can use the aio_suspend function to suspend (or block) the
calling process until an asynchronous I/O
request has completed, a signal is raised, or an optional
timeout occurs. The caller provides a list of aiocb
references for which the completion of at least one will cause
aio_suspend to return. The function prototype
for aio_suspend is:
int aio_suspend( const struct aiocb *const cblist[], int n,
const struct timespec *timeout );
Using aio_suspend is quite simple. A list of aiocb references is
provided. If any of them complete, the call
returns with 0. Otherwise, -1 is returned, indicating an error
occurred. See Listing 3.
Listing 3. Using the aio_suspend function to block on
asynchronous I/Os
const struct aiocb *cblist[MAX_LIST];

/* Clear the list. */
bzero( (char *)cblist, sizeof(cblist) );

/* Load one or more references into the list */
cblist[0] = &my_aiocb;

ret = aio_read( &my_aiocb );

ret = aio_suspend( cblist, MAX_LIST, NULL );
Note that the second argument of aio_suspend is the number of
elements in cblist, not the number of
aiocb references. Any NULL element in the cblist is ignored by
aio_suspend.
If a timeout is provided to aio_suspend and the timeout occurs,
then -1 is returned and errno contains
EAGAIN.
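A timeout can be folded into a small wrapper that separates "timed out" from other failures. This is a sketch only; the name wait_any and its three-way return convention are assumptions of this example.

```c
#include <aio.h>
#include <errno.h>
#include <time.h>

/* Wait until at least one request in `list` completes.
 * Returns 1 on completion, 0 if `timeout_ms` expired first, and
 * -1 on any other error (errno set, for example EINTR). */
int wait_any(const struct aiocb *const list[], int n, long timeout_ms)
{
    struct timespec ts;

    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    if (aio_suspend(list, n, &ts) == 0)
        return 1;
    return (errno == EAGAIN) ? 0 : -1;   /* EAGAIN means the timeout hit */
}
```

After a timeout, the requests in the list are still outstanding, so the aiocb structures and their buffers must remain valid until each request completes or is canceled.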
aio_cancel
The aio_cancel function allows you to cancel one or all
outstanding I/O requests for a given file descriptor.
Its prototype is:
int aio_cancel( int fd, struct aiocb *aiocbp );
To cancel a single request, provide the file descriptor and the
aiocb reference. If the request is successfully
canceled, the function returns AIO_CANCELED. If the request is
already in progress and cannot be canceled, the function returns
AIO_NOTCANCELED. If the request had already completed, the
function returns AIO_ALLDONE.
To cancel all requests for a given file descriptor, provide that
file descriptor and a NULL reference for aiocbp.
The function returns AIO_CANCELED if all requests are canceled,
AIO_NOTCANCELED if at least one request
couldn't be canceled, and AIO_ALLDONE if all of the requests
had already completed. You can then evaluate
each individual AIO request using aio_error. If the request was
canceled, aio_error returns ECANCELED, and aio_return
returns -1.
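The three results of aio_cancel can be handled with a simple switch. The sketch below is illustrative: cancel_all and was_canceled are hypothetical helper names, and the returned strings exist only to make the outcome readable.

```c
#include <aio.h>
#include <errno.h>
#include <stddef.h>

/* Try to cancel every outstanding request on `fd` and describe
 * what aio_cancel() reported. */
const char *cancel_all(int fd)
{
    switch (aio_cancel(fd, NULL)) {
    case AIO_CANCELED:
        return "all canceled";
    case AIO_NOTCANCELED:
        return "some still in progress";
    case AIO_ALLDONE:
        return "nothing left to cancel";
    default:
        return "error";           /* -1 with errno set, such as EBADF */
    }
}

/* After a cancel attempt, report whether one specific request was
 * canceled. Call this before aio_return() for that request. */
int was_canceled(const struct aiocb *cb)
{
    return aio_error(cb) == ECANCELED;
}
```

When the result is "some still in progress", each remaining request still has to be waited for and reaped with aio_return before its aiocb and buffer are released.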
lio_listio
Finally, AIO provides a way to initiate multiple transfers at
the same time using the lio_listio API
function. This function is important because it means you can
start lots of I/Os in the context of a single
system call (meaning one kernel context switch). This is great
from a performance perspective, so it's worth
exploring. The lio_listio API function has the following
prototype:
int lio_listio( int mode, struct aiocb *list[], int nent, struct
sigevent *sig );
The mode argument can be LIO_WAIT or LIO_NOWAIT. LIO_WAIT blocks
the call until all I/O has completed.
LIO_NOWAIT returns after the operations have been queued. The
list is a list of aiocb references, with the
maximum number of elements defined by nent. Note that elements
of list may be NULL, which lio_listio
ignores. The sigevent reference defines the method for signal
notification when all I/O is complete.
The request for lio_listio is slightly different than the
typical read or write request in that the operation
must be specified. This is illustrated in Listing 4.
Listing 4. Using the lio_listio function to initiate a list of
requests
struct aiocb aiocb1, aiocb2;
struct aiocb *list[MAX_LIST];

...

/* Prepare the first aiocb */
aiocb1.aio_fildes = fd;
aiocb1.aio_buf = malloc( BUFSIZE+1 );
aiocb1.aio_nbytes = BUFSIZE;
aiocb1.aio_offset = next_offset;
aiocb1.aio_lio_opcode = LIO_READ;

...

bzero( (char *)list, sizeof(list) );
list[0] = &aiocb1;
list[1] = &aiocb2;

ret = lio_listio( LIO_WAIT, list, MAX_LIST, NULL );
The read operation is noted in the aio_lio_opcode field with
LIO_READ. For a write operation, LIO_WRITE is
used, but LIO_NOP is also valid for no operation.
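For a complete picture, here is a hedged sketch that reads several consecutive chunks of a file with a single lio_listio call in LIO_WAIT mode. The function read_chunks and the MAX_CHUNKS limit are assumptions made for this example.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

#define MAX_CHUNKS 8    /* arbitrary limit chosen for this sketch */

/* Read `count` consecutive chunks of `chunk` bytes from the start of
 * `fd` into `out`. Returns the total bytes read, or -1 on failure. */
ssize_t read_chunks(int fd, char *out, size_t chunk, int count)
{
    struct aiocb cbs[MAX_CHUNKS];
    struct aiocb *list[MAX_CHUNKS];
    ssize_t total = 0;
    int rc;

    if (count < 1 || count > MAX_CHUNKS)
        return -1;

    memset(cbs, 0, sizeof(cbs));
    for (int i = 0; i < count; i++) {
        cbs[i].aio_fildes = fd;
        cbs[i].aio_buf = out + (size_t)i * chunk;
        cbs[i].aio_nbytes = chunk;
        cbs[i].aio_offset = (off_t)i * (off_t)chunk;
        cbs[i].aio_lio_opcode = LIO_READ;   /* operation is set per entry */
        list[i] = &cbs[i];
    }

    /* LIO_WAIT: return only after every request in the list finishes. */
    rc = lio_listio(LIO_WAIT, list, count, NULL);

    /* lio_listio() reports on the batch; each entry has its own status. */
    for (int i = 0; i < count; i++) {
        const struct aiocb *const one[1] = { &cbs[i] };
        ssize_t n;
        int err;

        while ((err = aio_error(&cbs[i])) == EINPROGRESS)
            aio_suspend(one, 1, NULL);   /* only if the wait was cut short */
        n = aio_return(&cbs[i]);
        if (err != 0 || n < 0)
            rc = -1;
        else
            total += n;
    }
    return rc < 0 ? -1 : total;
}
```

With LIO_NOWAIT the same list could be submitted without blocking, at the cost of tracking completion through aio_error, aio_suspend, or a notification.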
AIO notifications
Now that you've seen the AIO functions that are available, this
section digs into the methods that you can use
for asynchronous notification. I'll explore asynchronous
notification through signals and function callbacks.
Asynchronous notification with signals
The use of signals for interprocess communication (IPC) is a
traditional mechanism in UNIX and is also
supported by AIO. In this paradigm, the application defines a
signal handler that is invoked when a specified
signal occurs. The application then specifies that an
asynchronous request will raise a signal when the request
has completed. As part of the signal context, the particular
aiocb request is provided to keep track of
multiple potentially outstanding requests. Listing 5
demonstrates this notification method.
Listing 5. Using signals as notification for AIO requests
void aio_completion_handler( int signo, siginfo_t *info, void *context );

void setup_io( ... )
{
  int fd, ret;
  struct sigaction sig_act;
  /* static: the aiocb must remain valid until the request completes */
  static struct aiocb my_aiocb;

  ...

  /* Set up the signal handler */
  sigemptyset(&sig_act.sa_mask);
  sig_act.sa_flags = SA_SIGINFO;
  sig_act.sa_sigaction = aio_completion_handler;

  /* Set up the AIO request */
  bzero( (char *)&my_aiocb, sizeof(struct aiocb) );
  my_aiocb.aio_fildes = fd;
  my_aiocb.aio_buf = malloc(BUF_SIZE+1);
  my_aiocb.aio_nbytes = BUF_SIZE;
  my_aiocb.aio_offset = next_offset;

  /* Link the AIO request with the Signal Handler */
  my_aiocb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
  my_aiocb.aio_sigevent.sigev_signo = SIGIO;
  my_aiocb.aio_sigevent.sigev_value.sival_ptr = &my_aiocb;

  /* Map the Signal to the Signal Handler */
  ret = sigaction( SIGIO, &sig_act, NULL );

  ...

  ret = aio_read( &my_aiocb );

}

void aio_completion_handler( int signo, siginfo_t *info, void *context )
{
  struct aiocb *req;
  ssize_t ret;

  /* Ensure it's our signal */
  if (info->si_signo == SIGIO) {

    req = (struct aiocb *)info->si_value.sival_ptr;

    /* Did the request complete? */
    if (aio_error( req ) == 0) {

      /* Request completed successfully, get the return status */
      ret = aio_return( req );

    }

  }

  return;
}
In Listing 5, you set up your signal handler to catch the SIGIO
signal in the aio_completion_handler
function. You then initialize the aio_sigevent structure to
raise SIGIO for notification (which is specified via
the SIGEV_SIGNAL definition in sigev_notify). When your read
completes, your signal handler extracts the
particular aiocb from the signal's si_value structure and checks
the error status and return status to
determine I/O completion.
For performance, the completion handler is an ideal spot to
continue the I/O by requesting the next
asynchronous transfer. In this way, when one
transfer has completed, you immediately start the
next.
Asynchronous notification with callbacks
An alternative notification mechanism is the system callback.
Instead of raising a signal for notification, this
mechanism calls a function in user-space for notification. You
initialize the aiocb reference into the
sigevent structure to uniquely identify the particular request
being completed; see Listing 6.
Listing 6. Using thread callback notification for AIO
requests
void aio_completion_handler( union sigval sigval );

void setup_io( ... )
{
  int fd, ret;
  /* static: the aiocb must remain valid until the request completes */
  static struct aiocb my_aiocb;

  ...

  /* Set up the AIO request */
  bzero( (char *)&my_aiocb, sizeof(struct aiocb) );
  my_aiocb.aio_fildes = fd;
  my_aiocb.aio_buf = malloc(BUF_SIZE+1);
  my_aiocb.aio_nbytes = BUF_SIZE;
  my_aiocb.aio_offset = next_offset;

  /* Link the AIO request with a thread callback */
  my_aiocb.aio_sigevent.sigev_notify = SIGEV_THREAD;
  my_aiocb.aio_sigevent.sigev_notify_function = aio_completion_handler;
  my_aiocb.aio_sigevent.sigev_notify_attributes = NULL;
  my_aiocb.aio_sigevent.sigev_value.sival_ptr = &my_aiocb;

  ...

  ret = aio_read( &my_aiocb );

}

void aio_completion_handler( union sigval sigval )
{
  struct aiocb *req;
  ssize_t ret;

  req = (struct aiocb *)sigval.sival_ptr;

  /* Did the request complete? */
  if (aio_error( req ) == 0) {

    /* Request completed successfully, get the return status */
    ret = aio_return( req );

  }

  return;
}
In Listing 6, after creating your aiocb request, you request a
thread callback using SIGEV_THREAD for the
notification method. You then specify the particular
notification handler and load the context to be passed to
the handler (in this case, a reference to the aiocb request
itself). In the handler, you simply cast the incoming
sigval pointer and use the AIO functions to validate the
completion of the request.
System tuning for AIO
The proc file system contains two virtual files that can be
tuned for asynchronous I/O performance:
The /proc/sys/fs/aio-nr file provides the current number of
system-wide asynchronous I/O requests.
The /proc/sys/fs/aio-max-nr file is the maximum number of
allowable concurrent requests. The
maximum is commonly 65536 (64K) requests, which is adequate for
most applications.
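Both values can be read like any other text file. The sketch below assumes each file holds a single decimal number; the helper names read_long_file and aio_headroom are hypothetical.

```c
#include <stdio.h>

/* Read a single integer from a text file such as a /proc entry.
 * Returns 0 on success and -1 on failure. */
int read_long_file(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Report how many more requests fit under the system-wide limit,
 * or -1 if either file could not be read. */
long aio_headroom(void)
{
    long nr, max;

    if (read_long_file("/proc/sys/fs/aio-nr", &nr) < 0 ||
        read_long_file("/proc/sys/fs/aio-max-nr", &max) < 0)
        return -1;
    return max - nr;
}
```

Raising the limit means writing a new value to /proc/sys/fs/aio-max-nr, which normally requires root privileges.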
Summary
Using asynchronous I/O can help you build faster and more
efficient I/O applications. If your application can
overlap processing and I/O, then AIO can help you build an
application that more efficiently uses the CPU
resources available to you. While this I/O model differs from
the traditional blocking patterns found in most
Linux applications, the asynchronous notification model is
conceptually simple and can simplify your design.
About the author
M. Tim Jones is an embedded software architect and the author of
GNU/Linux Application Programming, AI
Application Programming, and BSD Sockets Programming from a
Multilanguage Perspective. His
engineering background ranges from the development of kernels
for geosynchronous spacecraft to embedded
systems architecture and networking protocols development. Tim
is a Consultant Engineer for Emulex Corp.
in Longmont, Colorado.