Because the information Atropos exposes to applications
is simple, the interface to Atropos can be
readily implemented with small extensions to the commands
already defined in the SCSI protocol. The parameters
p and w could be exposed in a new mode page
returned by the MODE SENSE SCSI command. To ensure
that Atropos executes all requests to non-contiguous
VLBNs for the other-major access together, an application
can link the appropriate requests. To do so, the
READ or WRITE commands for semi-sequential access
are issued with the Link bit set.
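As an illustration of the linking step, the following minimal sketch builds READ(10) command descriptor blocks with the Link bit set. It assumes the standard 10-byte READ(10) layout and the SCSI-2 convention that the Link bit is bit 0 of the CONTROL byte and that the last command of a chain leaves it clear. The helper names read10_cdb and linked_reads are hypothetical and are not part of Atropos.

```python
import struct

READ_10 = 0x28   # SCSI READ(10) operation code
LINK_BIT = 0x01  # bit 0 of the CDB CONTROL byte (SCSI-2 linked commands)


def read10_cdb(lbn, nblocks, link):
    """Build a 10-byte READ(10) CDB, optionally with the Link bit set."""
    control = LINK_BIT if link else 0
    # Fields: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length, control byte (all big-endian).
    return struct.pack(">BBIBHB", READ_10, 0, lbn, 0, nblocks, control)


def linked_reads(lbns, nblocks=1):
    """Link every command except the last, which ends the chain."""
    last = len(lbns) - 1
    return [read10_cdb(lbn, nblocks, link=(i != last))
            for i, lbn in enumerate(lbns)]


# Hypothetical semi-sequential access to every 16th VLBN.
cdbs = linked_reads([0, 16, 32])
```

The sketch only constructs the command bytes; submitting them to a device (for example through /dev/sg) is outside its scope.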
3.3.4 Implementation details
Our Atropos logical volume manager implementation
is a stand-alone process that accepts I/O requests via
a socket. It issues individual disk I/Os directly to the
attached SCSI disks using the Linux raw SCSI device
/dev/sg. With an SMP host, the process can run on a
separate CPU of the same host, to minimize the effect on
the execution of the main application.
An application using Atropos is linked with a stub library
providing API functions for reading and writing.
The library uses shared memory to avoid data copies and
communicates through the socket with the Atropos LVM
process. The Atropos LVM organization is specified by
a configuration file, which functions in lieu of a format
command. The file lists the number of disks, p, the desired
block size, b, and the list of disks to be used.
For convenience, the interface stub also includes three
functions. The function get_boundaries(LBN) returns
the stripe unit boundaries between which the given LBN
falls. Hence, these boundaries delimit a run of w
contiguous LBNs for constructing efficient I/Os. The
get_rectangle(LBN) function returns the wp contiguous
LBNs in a single row across all disks. These functions
are just convenient wrappers that calculate the proper
LBNs from the w and p parameters. Finally, the stub
interface also includes a batch() function to explicitly
group READ and WRITE commands (e.g., for semi-sequential
access).
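The arithmetic behind the two wrappers can be sketched as follows. This is only an illustration under simplifying assumptions: w is constant across the volume, stripe units are aligned to multiples of w starting at LBN 0, and w and p are passed explicitly, whereas the stub functions described above take only the LBN. Returning an inclusive (first, last) pair is likewise an assumption.

```python
from typing import Tuple


def get_boundaries(lbn: int, w: int) -> Tuple[int, int]:
    """First and last LBN of the w-block stripe unit containing lbn."""
    first = (lbn // w) * w
    return first, first + w - 1


def get_rectangle(lbn: int, w: int, p: int) -> Tuple[int, int]:
    """First and last LBN of the row of w*p LBNs (one stripe unit on
    each of the p disks) containing lbn."""
    row = w * p
    first = (lbn // row) * row
    return first, first + row - 1


# Hypothetical parameters: 8 LBNs per stripe unit, 4 disks.
unit = get_boundaries(37, w=8)       # stripe unit around LBN 37
row = get_rectangle(37, w=8, p=4)    # full row around LBN 37
```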
With no outstanding requests in the queue (i.e., the
disk is idle), current SCSI disks will immediately schedule
the first received request of a batch, even though it may
not be the one with the smallest rotational latency. This
diminishes the effectiveness of semi-sequential access.
To overcome this problem, our Atropos implementation
“pre-schedules” the batch of requests by sending first the
request that will incur the smallest rotational latency. It
uses known techniques for SPTF scheduling outside of
disk firmware [14]. With the help of a detailed and validated
model of the disk mechanics [2, 21], the disk head
position is deduced from the location and time of the
last-completed request. If disks waited for all requests
of a batch before making a scheduling decision, this pre-scheduling
would not be necessary.
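The pre-scheduling idea can be sketched in a few lines. This is a deliberately simplified model, not the validated disk model the implementation relies on: it assumes a constant rotation speed, represents angular positions as fractions of a revolution, and ignores seek and head switch times. The function names, the (vlbn, start_angle) request tuples, and the 6 ms rotation time are all hypothetical.

```python
def head_angle_now(last_end_angle, last_end_time, now, rotation_time):
    """Deduce the current angular head position (0.0 to 1.0) from where
    and when the last-completed request finished."""
    elapsed = (now - last_end_time) / rotation_time
    return (last_end_angle + elapsed) % 1.0


def rotational_latency(head_angle, start_angle):
    """Fraction of a revolution until start_angle passes under the head."""
    return (start_angle - head_angle) % 1.0


def preschedule(batch, head_angle):
    """Move the request with the smallest rotational latency to the
    front of the batch; the remaining requests keep their order."""
    first = min(range(len(batch)),
                key=lambda i: rotational_latency(head_angle, batch[i][1]))
    return [batch[first]] + batch[:first] + batch[first + 1:]


# Each request is (vlbn, start_angle); values are illustrative only.
batch = [(100, 0.25), (116, 0.75), (132, 0.0)]
angle = head_angle_now(0.0, 0.0, now=3.0, rotation_time=6.0)
ordered = preschedule(batch, angle)
```

In this example the head is half a revolution past the end of the last request, so the request starting at angle 0.75 is sent first.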
Our implementation of the Atropos logical volume
manager is about 2000 lines of C++ code and includes
implementations of RAID levels 0 and 1. Another 600
lines of C code implement methods for automatically extracting
track boundaries and head switch time [22, 26].
4 Efficient access in database systems
Efficient access to database tables in both dimensions
can significantly improve the performance of a variety of
queries that perform selective table scans. These queries can request
(i) a subset of columns (restricting access along the
primary dimension, if the order is column-major), which
is prevalent in decision support workloads (TPC-H), (ii)
a subset of rows (restricting access along the secondary
dimension), which is prevalent in online transaction processing
(TPC-C), or (iii) a combination of both.
A companion project [24] to Atropos extends the
Shore database storage manager [3] to support a page
layout that takes advantage of Atropos’s efficient accesses
in both dimensions. The page layout is based
on a cache-efficient page layout, called PAX [1], which
extends the NSM page layout to group values of a single
attribute into units called “minipages”. Minipages in
PAX exist to take advantage of CPU cache prefetchers
to minimize cache misses during single-attribute memory
accesses. We use minipages as well, but they are
aligned and sized to fit into one or more 512-byte LBNs,
depending on the relative sizes of the attributes within a
single page.
The mapping of 8 KB pages onto the quadrangles
of the Atropos logical volume is depicted in Figure 6.
A single page contains 16 equally-sized attributes, labeled
A1–A16, where each attribute is stored in a separate
minipage that maps to a single VLBN. Accessing a
single page is thus done by issuing 16 batched requests
to every 16th (or more generally, wp-th) VLBN. Internally,
the VLBNs comprising this page are mapped diagonally
to the blocks marked with the dashed arrow.
Hence, four semi-sequential accesses proceeding in parallel
can fetch the entire page (i.e., row-major order access).
Individual minipages are mapped across sequential
runs of VLBNs. For example, to fetch attribute A1 for
records 0–399, the database storage manager can issue
one efficient sequential I/O to fetch the appropriate minipages.
Atropos breaks this I/O into four efficient, track-based
disk accesses proceeding in parallel. The database
storage manager then reassembles these minipages into
appropriate 8 KB pages [24].
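The two access patterns can be illustrated with a small address calculation. The layout below is an assumption chosen to be consistent with the description above, not the exact mapping of the companion project: within one group of wp pages, the minipages of a given attribute occupy wp consecutive VLBNs, so the minipages of a single page lie wp VLBNs apart. The function names and the base parameter are hypothetical.

```python
def page_vlbns(page_slot, n_attrs, w, p, base=0):
    """VLBNs of all minipages of one page: one request per attribute,
    each w*p VLBNs apart (semi-sequential access)."""
    stride = w * p
    if not 0 <= page_slot < stride:
        raise ValueError("page_slot must lie within one group of w*p pages")
    return [base + a * stride + page_slot for a in range(n_attrs)]


def attribute_run(attr, n_pages, w, p, base=0):
    """First and last VLBN of one attribute's minipages for the first
    n_pages pages of a group (a single sequential I/O)."""
    stride = w * p
    if not 0 < n_pages <= stride:
        raise ValueError("n_pages must lie within one group of w*p pages")
    first = base + attr * stride
    return first, first + n_pages - 1


# Figure 6 uses w*p = 16 and 16 attributes per page; w = 4 and p = 4
# are assumed here for illustration.
whole_page = page_vlbns(page_slot=3, n_attrs=16, w=4, p=4)
column_a1 = attribute_run(attr=0, n_pages=16, w=4, p=4)
```

Fetching whole_page corresponds to the 16 batched requests to every 16th VLBN, while column_a1 describes a single contiguous range that can be read with one sequential I/O.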