
Where does the system restrict file size?
Solaris 2.5.1 has a 2 GB limit on file size. Solaris 2.6
and Solaris 7 have a much higher limit of 1 terabyte.
see
http://www.sunworld.com/sunworldonline/swol-07-1998/swol-07-insidesol...
Asynchronous I/O and large file support in Solaris
Delve into asynchronous I/O facilities and 64-bit file support in Solaris 2.6 and beyond
Abstract
Asynchronous I/O interfaces have been available in Solaris for some time,
providing a means by which applications could issue I/O requests and not
have to "block" or cease working until the I/O was completed.
64-bit file support was added to the asynchronous I/O interfaces before the
full-blown large file support that came with Solaris 2.6.
With Solaris 2.6, file sizes in Solaris are no longer limited to a maximum size of 2 gigabytes. In compliance with the specifications established by the Large File Summit, a number of changes have been made in the kernel, including extensions to the file APIs and shell commands for the implementation of large files.
This month, Jim examines the asynchronous I/O facilities and 64-bit file implementation in Solaris. (4,200 words)
The first 64-bit file I/O interfaces found their way into Solaris in
the 2.5.1 release, with the introduction of just two new read and
write APIs: aioread64(3) and aiowrite64(3),
which are extended versions of the aioread(3) and aiowrite(3)
asynchronous I/O (aio) routines that have been available in Solaris
for some time. The goal was to provide relational database
vendors a facility for asynchronous (async) I/O on raw devices that weren't
limited to 2 gigabytes in size. Since a vast majority of Sun servers run some form
of database application, it made sense to get 64-bit file support
out the door early for these applications.
Async I/O routines provide the ability to do real asynchronous I/O in an application.
This is accomplished by allowing the calling process or thread to continue processing after issuing a read or write; the caller is notified either upon completion of the I/O operation or upon an error
condition that prevented the I/O from being completed. This works because the routine calling either aioread(3) or aiowrite(3) is required to pass, as one of the required arguments, a pointer to an aio_result structure. The aio_result structure has two members: aio_return and aio_errno. The system uses these to set the return value of
the call or, in the case of an error, the errno (error number). From /usr/include/sys/aio.h:
typedef struct aio_result_t {
        int aio_return;         /* return value of read or write */
        int aio_errno;          /* errno generated by the IO */
} aio_result_t;
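As a rough illustration (my own sketch, not from the article; the file name and buffer size are placeholders), issuing an async read with this interface looks something like the following. The 64-bit aioread64(3) variant is called the same way but takes an off64_t offset:

/*
 * Sketch: queue an asynchronous read with aioread(3).
 * On Solaris this typically links against -laio.
 */
#include <sys/types.h>
#include <sys/asynch.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
        static char buf[8192];
        aio_result_t result;
        int fd;

        if ((fd = open("/tmp/datafile", O_RDONLY)) == -1) {
                perror("open");
                exit(1);
        }

        /*
         * The call returns immediately; the system fills in
         * result.aio_return and result.aio_errno when the I/O
         * completes (notification via SIGIO or aiowait(3)).
         */
        if (aioread(fd, buf, sizeof (buf), (off_t)0, SEEK_SET,
            &result) == -1) {
                perror("aioread");
                exit(1);
        }

        /* ... free to do other work while the I/O is in flight ... */
        return (0);
}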
Two different sets of interfaces exist to do async I/O in
Solaris: The aforementioned aioread(3) and aiowrite(3) routines, and
the POSIX-equivalent routines, aio_read(3R) and aio_write(3R), which are based
on the POSIX standards for realtime extensions. Realtime applications must, by definition,
deal with an unpredictable flow of external interrupt conditions that require predictable,
bounded response times. In order to meet that requirement, a
complete non-blocking I/O facility is needed. This is where
asynchronous I/O comes in, as these interfaces can meet the requirements of
most realtime applications. The POSIX and Solaris asynchronous I/O
interfaces are functionally identical. The real differences exist in
the semantics of using one interface or the other. This month's column will
provide information that is applicable to both sets of interfaces.
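For comparison, here is a minimal sketch (mine, not from the article; the file name is a placeholder) of the POSIX flavor, where the request is described by a struct aiocb and the caller can wait with aio_suspend(3R). On Solaris these routines typically link against -lrt (libposix4 on older releases):

/* Sketch: a POSIX asynchronous read with aio_read(3R). */
#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
        static char buf[8192];
        struct aiocb cb;
        const struct aiocb *list[1];
        int fd;

        if ((fd = open("/tmp/datafile", O_RDONLY)) == -1) {
                perror("open");
                exit(1);
        }

        /* Describe the request in an aiocb rather than passing
         * an aio_result structure. */
        (void) memset(&cb, 0, sizeof (cb));
        cb.aio_fildes = fd;
        cb.aio_buf = buf;
        cb.aio_nbytes = sizeof (buf);
        cb.aio_offset = 0;

        if (aio_read(&cb) == -1) {
                perror("aio_read");
                exit(1);
        }

        /* Block until the request completes, then collect the
         * result with aio_error()/aio_return(). */
        list[0] = &cb;
        (void) aio_suspend(list, 1, NULL);
        if (aio_error(&cb) == 0)
                printf("read %ld bytes\n", (long)aio_return(&cb));

        return (0);
}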
Asynchronous I/O is implemented using the lwp (light-weight process) system calls, which are the lower level implementation of the user-level threads library. Multithreaded applications
can be developed in Solaris using either Solaris threads (e.g., thr_create(3T)
to create a new thread within a process) or POSIX threads (e.g., pthread_create(3T)
to create new threads). Both the Solaris and POSIX thread interfaces are library
routines that do some basic housekeeping functions in user-mode
before entering the kernel through the system call interface. The
system calls that ultimately get executed for threads are the
_lwp_xxxx(2) routines; for example, thr_create(3T) and pthread_create(3T)
both enter the kernel via _lwp_create(2), the lower-level
interface. It's possible to use the _lwp_xxxx(2) calls directly
from your program, but these routines are more difficult to use and they break code
portability. (That's why we have library routines.) (We'll be covering the topic of processes,
threads, and lwps in Solaris in a future Inside Solaris
column.)
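For instance (a minimal sketch of my own, not from the article), a POSIX thread is created through the library interface like this; under the covers, the library enters the kernel with _lwp_create(2):

/* Sketch: create a thread via pthread_create(3T); link -lpthread. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static void *
worker(void *arg)
{
        (void) arg;             /* unused in this sketch */
        printf("worker running\n");
        return (NULL);
}

int
main(void)
{
        pthread_t tid;

        if (pthread_create(&tid, NULL, worker, NULL) != 0) {
                fprintf(stderr, "pthread_create failed\n");
                exit(1);
        }
        (void) pthread_join(tid, NULL);
        return (0);
}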
Anyway, back to async I/O. As I said, the original implementation of
the aioread(3) and aiowrite(3) routines creates a queue of I/O
requests and processes them through user-level threads. When the
aioread(3) or aiowrite(3) is entered, the system will simply put the
I/O in a queue and create an lwp (a thread) to do the I/O. The lwp
returns when the I/O is complete (or when an error occurs), and the
calling process is notified via a special signal, SIGIO. It's
up to you to put a signal handler in place to receive the
SIGIO and take appropriate action, which minimally includes checking
the return status of the read or write by reading the aio_result
structure's aio_return value. As an alternative to the signal-based
SIGIO notification, you have the option of calling
aiowait(3) after issuing an aioread(3) or aiowrite(3). This will
cause the calling thread to block until the pending async I/O has
completed. A timeout value can be set and passed as an
argument to aiowait(3) so that the system waits only for a specified
amount of time.
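Here's a rough sketch (mine, not from the article) of that aiowait(3) alternative; the five-second timeout is an arbitrary placeholder:

/* Sketch: wait for a previously queued aioread(3)/aiowrite(3)
 * request with aiowait(3) instead of a SIGIO handler. */
#include <sys/types.h>
#include <sys/asynch.h>
#include <sys/time.h>
#include <stdio.h>

void
wait_for_io(void)
{
        struct timeval timeout;
        aio_result_t *resultp;

        timeout.tv_sec = 5;     /* wait at most five seconds */
        timeout.tv_usec = 0;

        /*
         * aiowait(3) blocks until one outstanding request completes
         * and returns a pointer to its aio_result structure; it
         * returns NULL if the timeout expires and (aio_result_t *)-1
         * on error.
         */
        resultp = aiowait(&timeout);
        if (resultp == (aio_result_t *)-1) {
                perror("aiowait");
                return;
        }
        if (resultp == NULL) {
                printf("timed out with I/O still pending\n");
                return;
        }
        printf("I/O done: return %d, errno %d\n",
            resultp->aio_return, resultp->aio_errno);
}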
While the threads library implementation of async I/O works well
enough for many applications, it didn't necessarily provide optimal
performance for applications that made heavy use of the async I/O
facilities. Commercial relational database systems, for example,
use the async I/O interfaces extensively. The overhead
associated with the creation, management, and scheduling of user
threads made it clear that an implementation with less
overhead, better performance, and better scalability was in
order. A review of the existing async I/O architecture and
subsequent engineering effort resulted in an implementation called
kernel asynchronous I/O, or kaio.
Kaio first appeared in Solaris 2.4 (with a handful of required patches) and
has been available, with some restrictions, in every Solaris release
since. The restrictions have to do with which devices and
software include kaio support and which ones don't. The good news, from an application
standpoint, is that the presence or absence of kaio support for a
given combination of storage devices, volume managers, and file systems is transparent:
If kaio support exists, it will be used. If it doesn't, the original
library-based async I/O will be used. Applications don't change in
order to take advantage of kaio. The system figures out what is
available and allocates accordingly.
What kaio does, as the name implies, is implement async I/O
inside the kernel rather than in user-land via user threads. The I/O
queue is created and managed
...