mdadm - Unix, Linux Command
NAME
mdadm - manage MD devices
aka Linux Software RAID
SYNOPSIS
mdadm [mode] <raiddevice> [options] <component-devices>
DESCRIPTION
RAID devices are virtual devices created from two or more
real block devices. This allows multiple devices (typically disk
drives or partitions thereof) to be combined into a single device to
hold (for example) a single filesystem.
Some RAID levels include redundancy and so can survive some degree of
device failure.
Linux Software RAID devices are implemented through the md (Multiple
Devices) device driver.
Currently, Linux supports
LINEAR md devices,
RAID0 (striping),
RAID1 (mirroring),
RAID4, RAID5, RAID6, RAID10, MULTIPATH, and
FAULTY.
MULTIPATH is not a Software RAID mechanism, but does involve
multiple devices:
each device is a path to one common physical storage device.
FAULTY is also not true RAID, and it only involves one device. It
provides a layer over a true device that can be used to inject faults.
MODES
mdadm has several major modes of operation:
Tag | Description |
Assemble | |
Assemble the components of a previously created
array into an active array. Components can be explicitly given
or can be searched for.
mdadm checks that the components
do form a bona fide array, and can, on request, fiddle superblock
information so as to assemble a faulty array.
|
Build |
Build an array that doesn't have per-device superblocks. For these
sorts of arrays,
mdadm cannot differentiate between initial creation and subsequent assembly
of an array. It also cannot perform any checks that appropriate
components have been requested. Because of this, the
Build mode should only be used together with a complete understanding of
what you are doing.
|
Create |
Create a new array with per-device superblocks.
|
Follow or Monitor | |
Monitor one or more md devices and act on any state changes. This is
only meaningful for raid1, 4, 5, 6, 10 or multipath arrays, as
only these have interesting state. raid0 or linear never have
missing, spare, or failed drives, so there is nothing to monitor.
|
Grow |
Grow (or shrink) an array, or otherwise reshape it in some way.
Currently supported growth options include changing the active size
of component devices and changing the number of active devices in RAID
levels 1/4/5/6, as well as adding or removing a write-intent bitmap.
|
Incremental Assembly | |
Add a single device to an appropriate array. If the addition of the
device makes the array runnable, the array will be started.
This provides a convenient interface to a
hot-plug system. As each device is detected,
mdadm has a chance to include it in some array as appropriate.
|
Manage |
This is for doing things to specific components of an array such as
adding new spares and removing faulty devices.
|
Misc |
This is an "everything else" mode that supports operations on active
arrays, operations on component devices such as erasing old superblocks, and
information gathering operations.
|
Auto-detect | |
This mode does not act on a specific device or array, but rather it
requests the Linux Kernel to activate any auto-detected arrays.
|
OPTIONS
Options for selecting a mode are:
Tag | Description |
-A, --assemble | |
Assemble a pre-existing array.
|
-B, --build | |
Build a legacy array without superblocks.
|
-C, --create | |
Create a new array.
|
-F, --follow, --monitor | |
Select
Monitor mode.
|
-G, --grow | |
Change the size or shape of an active array.
|
-I, --incremental | |
Add a single device into an appropriate array, and possibly start the array.
|
--auto-detect | |
Request that the kernel starts any auto-detected arrays. This can only
work if
md is compiled into the kernel, not if it is a module.
Arrays can be auto-detected by the kernel if all the components are in
primary MS-DOS partitions with partition type
FD. In-kernel autodetect is not recommended for new installations. Using
mdadm to detect and assemble arrays, possibly in an
initrd, is substantially more flexible and should be preferred.
|
If a device is given before any options, or if the first option is
--add, --fail, or
--remove, then the MANAGE mode is assumed.
Anything other than these will cause the
Misc mode to be assumed.
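The default-mode rule above can be sketched as a small shell function. This is only an illustration of the rule as stated here, not mdadm's actual argument parser: the name guess_mode is hypothetical, and it inspects only the first argument.

```shell
# Hypothetical helper: guess which mode mdadm would assume when no
# mode-selecting option (-A, -B, -C, -F, -G, -I, --auto-detect) is given.
# It only looks at the first argument, following the rule described above.
guess_mode() {
    case "$1" in
        --add|--fail|--remove) echo "manage" ;;  # first option is a manage operation
        -*)                    echo "misc" ;;    # any other leading option
        *)                     echo "manage" ;;  # a device given before any options
    esac
}

guess_mode /dev/md0 --add /dev/sda1
guess_mode --detail /dev/md0
```

The function prints the assumed mode and never invokes mdadm, so it can be run safely on any machine.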
Options that are not mode-specific are:
Tag | Description |
-h, --help | |
Display general help message or, after one of the above options, a
mode-specific help message.
|
--help-options | |
Display more detailed help about command line parsing and some commonly
used options.
|
-V, --version | |
Print version information for mdadm.
|
-v, --verbose | |
Be more verbose about what is happening. This can be used twice to be
extra-verbose.
The extra verbosity currently only affects
--detail --scan and
--examine --scan.
|
-q, --quiet | |
Avoid printing purely informative messages. With this,
mdadm will be silent unless there is something really important to report.
|
-b, --brief | |
Be less verbose. This is used with
--detail and
--examine. Using
--brief with
--verbose gives an intermediate level of verbosity.
|
-f, --force | |
Be more forceful about certain operations. See the various modes for
the exact meaning of this option in different contexts.
|
-c, --config= | |
Specify the config file. Default is to use
/etc/mdadm.conf, or if that is missing then
/etc/mdadm/mdadm.conf. If the config file given is
partitions then nothing will be read, but
mdadm will act as though the config file contained exactly
DEVICE partitions and will read
/proc/partitions to find a list of devices to scan.
If the word
none is given for the config file, then
mdadm will act as though the config file were empty.
|
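The lookup order described for --config can be sketched as follows. This is a minimal illustration under stated assumptions: resolve_config is a hypothetical name, and the optional root prefix argument is added here only so the logic can be tried in a scratch directory.

```shell
# Hypothetical helper: report what would be used as the configuration,
# following the description of --config above.
# $1 = value given to --config (may be empty)
# $2 = optional root prefix (an assumption, for experimenting safely)
resolve_config() {
    cfg=$1
    root=${2:-}
    case "$cfg" in
        partitions) echo "builtin: DEVICE partitions" ;;  # nothing is read from disk
        none)       echo "builtin: empty" ;;              # behave as if the file were empty
        "")
            # No file named: prefer /etc/mdadm.conf, else /etc/mdadm/mdadm.conf.
            if [ -f "$root/etc/mdadm.conf" ]; then
                echo "$root/etc/mdadm.conf"
            else
                echo "$root/etc/mdadm/mdadm.conf"
            fi ;;
        *)          echo "$cfg" ;;                        # an explicitly named file
    esac
}

resolve_config partitions
resolve_config none
resolve_config /tmp/my-mdadm.conf
```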
-s, --scan | |
Scan config file or
/proc/mdstat for missing information.
In general, this option gives
mdadm permission to get any missing information (like component devices,
array devices, array identities, and alert destination) from the
configuration file (see previous option);
one exception is MISC mode when using
--detail or
--stop, in which case
--scan says to get a list of array devices from
/proc/mdstat.
|
-e, --metadata= | |
Declare the style of superblock (raid metadata) to be used. The
default is 0.90 for
--create, and to guess for other operations.
The default can be overridden by setting the
metadata value for the
CREATE keyword in
mdadm.conf.
Options are:
Tag | Description |
0, 0.90, default
|
Use the original 0.90 format superblock. This format limits arrays to
28 component devices and limits component devices of levels 1 and
greater to 2 terabytes.
|
1, 1.0, 1.1, 1.2
|
Use the new version-1 format superblock. This has few restrictions.
The different sub-versions store the superblock at different locations
on the device, either at the end (for 1.0), at the start (for 1.1) or
4K from the start (for 1.2).
|
|
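As a rough pre-flight check, the 28-device limit of the 0.90 format could be tested before creating an array. The sketch below is an assumption-laden illustration: check_metadata_limit is a hypothetical name, only the device-count limit is modelled (not the 2 terabyte limit), and version-1 formats are treated as unrestricted because the text only says they have "few restrictions".

```shell
# Hypothetical helper: check a planned device count against the metadata format.
# $1 = value intended for --metadata, $2 = number of component devices
check_metadata_limit() {
    case "$1" in
        0|0.90|default)
            if [ "$2" -le 28 ]; then
                echo "ok"
            else
                echo "too many devices for 0.90 metadata: $2 > 28"
                return 1
            fi ;;
        1|1.0|1.1|1.2)
            echo "ok" ;;   # no limit modelled for version-1 superblocks
        *)
            echo "unknown metadata version: $1"
            return 2 ;;
    esac
}

check_metadata_limit 0.90 28
check_metadata_limit 1.2 40
```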
--homehost= | |
This will override any
HOMEHOST setting in the config file and provides the identity of the host which
should be considered the home for any arrays.
When creating an array, the
homehost will be recorded in the superblock. For version-1 superblocks, it will
be prefixed to the array name. For version-0.90 superblocks, part of
the SHA1 hash of the hostname will be stored in the latter half of the
UUID.
When reporting information about an array, any array which is tagged
for the given homehost will be reported as such.
When using Auto-Assemble, only arrays tagged for the given homehost
will be assembled.
|
For create, build, or grow:
Tag | Description |
-n, --raid-devices= | |
Specify the number of active devices in the array. This, plus the
number of spare devices (see below) must equal the number of
component-devices (including "missing" devices)
that are listed on the command line for
--create. Setting a value of 1 is probably
a mistake and so requires that
--force be specified first. A value of 1 will then be allowed for linear,
multipath, raid0 and raid1. It is never allowed for raid4 or raid5.
This number can only be changed using
--grow for RAID1, RAID5 and RAID6 arrays, and only on kernels which provide
necessary support.
|
-x, --spare-devices= | |
Specify the number of spare (eXtra) devices in the initial array.
The number of component devices listed
on the command line must equal the number of raid devices plus the
number of spare devices.
After initial array creation, new devices are added to the array using the
--add command. If you add devices in excess of the number needed for the array,
they are automatically treated as spare devices. For grow mode, it is
not possible to grow the number of spare devices; instead you need to
grow (or shrink) the number of active devices in the array. Spare devices
are handled automatically after initial array creation.
|
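The counting rule for --create (raid devices plus spare devices must equal the number of listed component devices, with "missing" counted as a device) can be checked with a short sketch. The name check_device_count is hypothetical, and nothing here calls mdadm.

```shell
# Hypothetical pre-flight check for a --create command line.
# $1 = --raid-devices value, $2 = --spare-devices value,
# remaining arguments = component devices (the word "missing" counts as one)
check_device_count() {
    raid=$1
    spare=$2
    shift 2
    if [ $((raid + spare)) -eq $# ]; then
        echo "ok"
    else
        echo "mismatch: expected $((raid + spare)) devices, got $#"
        return 1
    fi
}

check_device_count 2 1 /dev/sda1 /dev/sdb1 /dev/sdc1
check_device_count 3 0 /dev/sda1 /dev/sdb1 missing
```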
-z, --size= | |
Amount (in Kibibytes) of space to use from each drive in RAID level 1/4/5/6.
This must be a multiple of the chunk size, and must leave about 128Kb
of space at the end of the drive for the RAID superblock.
If this is not specified
(as it normally is not) the smallest drive (or partition) sets the
size, though if there is a variance among the drives of greater than 1%, a warning is
issued.
This value can be set with
--grow for RAID level 1/4/5/6. If the array was created with a size smaller
than the currently active drives, the extra space can be accessed
using
--grow. The size can be given as
max which means to choose the largest size that fits on all current drives.
|
-c, --chunk= | |
Specify chunk size in kibibytes. The default is 64.
|
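Because --size must be a multiple of the chunk size, a requested size may need rounding down first. The following is a minimal arithmetic sketch: round_size_to_chunk is a hypothetical name, the default of 64 is taken from the --chunk entry, and the space reserved for the superblock at the end of the drive is not modelled.

```shell
# Hypothetical helper: round a size down to a multiple of the chunk size.
# $1 = desired size in kibibytes, $2 = chunk size in kibibytes (default 64)
round_size_to_chunk() {
    chunk=${2:-64}
    echo $(( $1 / chunk * chunk ))   # integer division discards the remainder
}

round_size_to_chunk 1000000        # default chunk size of 64
round_size_to_chunk 1000000 512    # explicit chunk size
```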
--rounding= | |
Specify rounding factor for linear array (==chunk size)
|
-l, --level= | |
Set raid level. When used with
--create, options are: linear, raid0, 0, stripe, raid1, 1, mirror, raid4, 4,
raid5, 5, raid6, 6, raid10, 10, multipath, mp, faulty. Obviously some of these are synonymous.
When used with
--build, only linear, stripe, raid0, 0, raid1, multipath, mp, and faulty are valid.
Not yet supported with
--grow.
|
-p, --layout= | |
This option configures the fine details of data layout for raid5,
and raid10 arrays, and controls the failure modes for
faulty.
The layout of the raid5 parity block can be one of
left-asymmetric, left-symmetric, right-asymmetric, right-symmetric, la, ra, ls, rs. The default is
left-symmetric.
When setting the failure mode for level
faulty, the options are:
write-transient, wt, read-transient, rt, write-persistent, wp, read-persistent, rp, write-all, read-fixable, rf, clear, flush, none.
Each failure mode can be followed by a number, which is used as a period
between fault generation. Without a number, the fault is generated
once on the first relevant request. With a number, the fault will be
generated after that many requests, and will continue to be generated
every time the period elapses.
Multiple failure modes can be current simultaneously by using the
--grow option to set subsequent failure modes.
"clear" or "none" will remove any pending or periodic failure modes,
and "flush" will clear any persistent faults.
To set the parity with
--grow, the level of the array ("faulty")
must be specified before the fault mode is specified.
Finally, the layout options for RAID10 are one of n, o or f followed
by a small number. The default is n2. The supported options are:
n signals near copies. Multiple copies of one data block are at
similar offsets in different devices.
o signals offset copies. Rather than the chunks being duplicated
within a stripe, whole stripes are duplicated but are rotated by one
device so duplicate blocks are on different devices. Thus subsequent
copies of a block are in the next drive, and are one chunk further
down.
f signals far copies
(multiple copies have very different offsets).
See md(4) for more detail about near and far.
The number is the number of copies of each datablock. 2 is normal, 3
can be useful. This number can be at most equal to the number of
devices in the array. It does not need to divide evenly into that
number (e.g. it is perfectly legal to have an n2 layout for an array
with an odd number of devices).
|
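The RAID10 layout syntax described above (one of n, o or f, followed by a copy count that may not exceed the number of devices) can be validated with a small sketch. check_raid10_layout is a hypothetical name, and this only mirrors the rules stated in the text, not mdadm's own parser.

```shell
# Hypothetical validator for a RAID10 --layout value.
# $1 = layout string such as n2, o2 or f3; $2 = number of devices in the array
check_raid10_layout() {
    layout=$1
    devices=$2
    copies=${layout#?}              # everything after the first character
    kind=${layout%"$copies"}        # the first character
    case "$kind" in
        n) name="near" ;;
        o) name="offset" ;;
        f) name="far" ;;
        *) echo "invalid layout: $layout"; return 1 ;;
    esac
    case "$copies" in
        ''|*[!0-9]*) echo "invalid copy count in: $layout"; return 1 ;;
    esac
    # The number of copies can be at most the number of devices.
    if [ "$copies" -gt "$devices" ]; then
        echo "too many copies: $copies > $devices devices"
        return 1
    fi
    echo "$name copies=$copies"
}

check_raid10_layout n2 3    # an odd device count is legal with n2
check_raid10_layout f3 4
```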
--parity= | |
same as
--layout (thus explaining the p of
-p).
|
-b, --bitmap= | |
Specify a file to store a write-intent bitmap in. The file should not
exist unless
--force is also given. The same file should be provided
when assembling the array. The file may not reside on a filesystem that is
built on top of the array the bitmap file is for or else a kernel deadlock
will occur. This is not a bug, it's a feature. If the word
internal is given, then the bitmap is stored with the metadata on the array,
and so is replicated on all devices. If the word
none is given with
--grow mode, then any bitmap that is present is removed.
To help catch typing errors, the filename must contain at least one
slash (/) if it is a real file (not internal or none).
Note: external bitmaps are only known to work on ext2 and ext3.
Storing bitmap files on other filesystems may result in serious problems.
Note: The choice of internal versus external bitmap can have a drastic impact
on performance.
While an internal bitmap is the most convenient as it doesn't require
a device totally separate from the array on which to store the bitmap
file, it has a larger impact on performance than an external bitmap.
This is because we can't predict which device in the array might fail, so
we store a copy of the bitmap on every device in the array when using an
internal bitmap. This means that prior to allowing a write to a section of
the array that is currently marked clean in the bitmap, we must issue a
write to change the bit for that section of the array from clean to dirty,
and must wait for the bitmap write to complete on all of the array devices
before the pending write to the array data area can proceed.
Especially if the array is under heavy load, these synchronous writes can
drastically impact performance. An external bitmap file is less convenient,
but there is only one copy of the bitmap, so there is only one bitmap write
that must complete before the pending write to the array data can proceed.
In addition, if your bitmap file device is not heavily loaded, and the
array is, then you will notice a considerable performance benefit
from the fact that bitmap writes are not competing with array reads/writes.
The performance impact of this option can be somewhat mitigated by
appropriate selection of a bitmap chunk size (next option).
|
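The typing-error guard for --bitmap (a real file name must contain at least one slash, while internal and none are keywords) can be sketched like this. check_bitmap_arg is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: classify a value intended for --bitmap=.
check_bitmap_arg() {
    case "$1" in
        internal|none) echo "keyword: $1" ;;   # none is meaningful with --grow
        */*)           echo "file: $1" ;;      # contains a slash, treated as a file
        *)             echo "rejected: $1 (no slash)"; return 1 ;;
    esac
}

check_bitmap_arg internal
check_bitmap_arg /var/lib/md0.bitmap
```

Writing a relative file as ./md0.bitmap rather than md0.bitmap satisfies the slash requirement.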
--bitmap-chunk= | |
Set the chunksize of the bitmap. Each bit corresponds to that many
Kilobytes of storage.
When using a file based bitmap, the default is to use the smallest
size that is at least 4 and requires no more than 2^21 chunks.
When using an
internal bitmap, the chunksize is automatically determined to make best use of
available space.
Note: This option can drastically affect performance of the array.
The more granular the bitmap is, the more
frequently writes will trigger synchronous bitmap updates and be delayed
until the bitmap update is complete. The trade off is that a more
granular bitmap means a shorter array resync time after any event causes
the array to go down unclean. Given raw drive speeds can be in excess
of 100MB/s on modern SATA/SAS drives, any bitmap chunk up to 262144 (256MB)
can generally be synced in a matter of just a few seconds. Smaller chunks
can be synced faster, but you reach a point of diminishing returns that is
quickly offset by the increased write performance degradation seen in
every day operation. Considering that the smaller bitmap chunk sizes
will only ever be a benefit on rare occasions (hopefully never), but that
you will pay for a small bitmap chunk every single day, it is recommended
that you select the largest bitmap chunk size you feel comfortable with.
|
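The default for file-based bitmaps (the smallest chunk of at least 4 that needs no more than 2^21 chunks) can be approximated with a short calculation. This sketch rests on an assumption not stated in the text: candidate sizes are tried by doubling, starting at 4. The name default_file_bitmap_chunk is hypothetical.

```shell
# Hypothetical calculation of the default bitmap chunk for a file-based bitmap.
# $1 = array size, in the same units as --bitmap-chunk
default_file_bitmap_chunk() {
    size_kb=$1
    chunk=4
    max_chunks=2097152   # 2^21
    # Assumption: candidates are tried by doubling until the chunk count fits.
    while [ $(( (size_kb + chunk - 1) / chunk )) -gt "$max_chunks" ]; do
        chunk=$((chunk * 2))
    done
    echo "$chunk"
}

default_file_bitmap_chunk 1073741824   # an array of 2^30 units
default_file_bitmap_chunk 1000000      # a small array stays at the minimum
```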
-W, --write-mostly | |
Subsequent devices listed in a
--build, --create, or
--add command will be flagged as write-mostly. This is valid for RAID1
only and means that the md driver will avoid reading from these
devices if at all possible. This can be useful if mirroring over a
slow link.
|
--write-behind= | |
Specify that write-behind mode should be enabled (valid for RAID1
only). If an argument is specified, it will set the maximum number
of outstanding writes allowed. The default value is 256.
A write-intent bitmap is required in order to use write-behind
mode, and write-behind is only attempted on drives marked as
write-mostly.
|
--assume-clean | |
Tell
mdadm that the array pre-existed and is known to be clean. It can be useful
when trying to recover from a major failure as you can be sure that no
data will be affected unless you actually write to the array. It can
also be used when creating a RAID1 or RAID10 if you want to avoid the
initial resync; however, this practice, while normally safe, is not
recommended. Use this only if you really know what you are doing.
|
--backup-file= | |
This is needed when
--grow is used to increase the number of
raid-devices in a RAID5 if there are no spare devices available.
See the section below on RAID_DEVICE CHANGES. The file should be
stored on a separate device, not on the raid array being reshaped.
|
-N, --name= | |
Set a
name for the array. This is currently only effective when creating an
array with a version-1 superblock. The name is a simple textual
string that can be used to identify array components when assembling.
|
-R, --run | |
Insist that
mdadm run the array, even if some of the components
appear to be active in another array or filesystem. Normally
mdadm will ask for confirmation before including such components in an
array. This option causes that question to be suppressed.
|
-f, --force | |
Insist that
mdadm accept the geometry and layout specified without question. Normally
mdadm will not allow creation of an array with only one device, and will try
to create a raid5 array with one missing drive (as this makes the
initial resync work faster). With
--force, mdadm will not try to be so clever.
|
-a, --auto{=no,yes,md,mdp,part,p}{NN} | |
Instruct mdadm to create the device file if needed, possibly allocating
an unused minor number. "md" causes a non-partitionable array
to be used. "mdp", "part" or "p" causes a partitionable array (2.6 and
later) to be used. "yes" requires the named md device to have
a standard format, and the type and minor number will be determined
from this. See DEVICE NAMES below.
The argument can also come immediately after
"-a". e.g. "-ap".
If
--auto is not given on the command line or in the config file, then
the default will be
--auto=yes.
If
--scan is also given, then any
auto= entries in the config file will override the
--auto instruction given on the command line.
For partitionable arrays,
mdadm will create the device file for the whole array and for the first 4
partitions. A different number of partitions can be specified at the
end of this option (e.g.
--auto=p7). If the device name ends with a digit, the partition names add a p,
and a number, e.g. "/dev/home1p3". If there is no
trailing digit, then the partition names just have a number added,
e.g. "/dev/scratch3".
If the md device name is in a standard format as described in DEVICE
NAMES, then it will be created, if necessary, with the appropriate
number based on that name. If the device name is not in one of these
formats, then an unused minor number will be allocated. The minor
number will be considered unused if there is no active array for that
number, and there is no entry in /dev for that number and with a
non-standard name.
|
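The partition naming rule for partitionable arrays (add "p" and a number when the device name ends in a digit, otherwise just the number) can be sketched as below. partition_name is a hypothetical helper that only formats names; it creates nothing in /dev.

```shell
# Hypothetical helper: derive a partition device name from an md device name.
# $1 = md device name, $2 = partition number
partition_name() {
    case "$1" in
        *[0-9]) echo "${1}p$2" ;;   # name ends in a digit: insert a "p"
        *)      echo "$1$2" ;;      # otherwise append the number directly
    esac
}

partition_name /dev/home1 3
partition_name /dev/scratch 3
```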
--symlink=no | |
Normally when
--auto causes
mdadm to create devices in
/dev/md/ it will also create symlinks from
/dev/ with names starting with
md or
md_. Use
--symlink=no to suppress this, or
--symlink=yes to enforce this even if it is suppressed in
mdadm.conf.
|
For assemble:
Tag | Description |
-u, --uuid= | |
UUID of the array to assemble. Devices which don't have this UUID are
excluded.
|
-m, --super-minor= | |
Minor number of device that array was created for. Devices which
don't have this minor number are excluded. If you create an array as
/dev/md1, then all superblocks will contain the minor number 1, even if
the array is later assembled as /dev/md2.
Giving the literal word "dev" for
--super-minor will cause
mdadm to use the minor number of the md device that is being assembled.
e.g. when assembling
/dev/md0, --super-minor=dev will look for super blocks with a minor number of 0.
|
-N, --name= | |
Specify the name of the array to assemble. This must be the name
that was specified when creating the array. It must either match
the name stored in the superblock exactly, or it must match
with the current
homehost prefixed to the start of the given name.
|
-f, --force | |
Assemble the array even if some superblocks appear out-of-date.
|
-R, --run | |
Attempt to start the array even if fewer drives were given than were
present last time the array was active. Normally if not all the
expected drives are found and
--scan is not used, then the array will be assembled but not started.
With
--run an attempt will be made to start it anyway.
|
--no-degraded | |
This is the reverse of
--run in that it inhibits the startup of an array unless all expected drives
are present. This is only needed with
--scan, and can be used if the physical connections to devices are
not as reliable as you would like.
|
-a, --auto{=no,yes,md,mdp,part} | |
See this option under Create and Build options.
|
-b, --bitmap= | |
Specify the bitmap file that was given when the array was created. If
an array has an
internal bitmap, there is no need to specify this when assembling the array.
|
--backup-file= | |
If
--backup-file was used to grow the number of raid-devices in a RAID5, and the system
crashed during the critical section, then the same
--backup-file must be presented to
--assemble to allow possibly corrupted data to be restored.
|
-U, --update= | |
Update the superblock on each device while assembling the array. The
argument given to this flag can be one of
sparc2.2, summaries, uuid, name, homehost, resync, byteorder, devicesize, or
super-minor.
The
sparc2.2 option will adjust the superblock of an array that was created on a Sparc
machine running a patched 2.2 Linux kernel. This kernel got the
alignment of part of the superblock wrong. You can use the
--examine --sparc2.2 option to
mdadm to see what effect this would have.
The
super-minor option will update the
preferred minor field on each superblock to match the minor number of the array being
assembled.
This can be useful if
--examine reports a different "Preferred Minor" to
--detail. In some cases this update will be performed automatically
by the kernel driver. In particular the update happens automatically
at the first write to an array with redundancy (RAID level 1 or
greater) on a 2.6 (or later) kernel.
The
uuid option will change the uuid of the array. If a UUID is given with the
--uuid option that UUID will be used as a new UUID and will
NOT be used to help identify the devices in the array.
If no
--uuid is given, a random UUID is chosen.
The
name option will change the
name of the array as stored in the superblock. This is only supported for
version-1 superblocks.
The
homehost option will change the
homehost as recorded in the superblock. For version-0 superblocks, this is the
same as updating the UUID.
For version-1 superblocks, this involves updating the name.
The
resync option will cause the array to be marked
dirty meaning that any redundancy in the array (e.g. parity for raid5,
copies for raid1) may be incorrect. This will cause the raid system
to perform a "resync" pass to make sure that all redundant information
is correct.
The
byteorder option allows arrays to be moved between machines with different
byte-order.
When assembling such an array for the first time after a move, giving
--update=byteorder will cause
mdadm to expect superblocks to have their byteorder reversed, and will
correct that order before assembling the array. This is only valid
with original (Version 0.90) superblocks.
The
summaries option will correct the summaries in the superblock. That is the
counts of total, working, active, failed, and spare devices.
The
devicesize option will rarely be of use. It applies to version 1.1 and 1.2 metadata
only (where the metadata is at the start of the device) and is only
useful when the component device has changed size (typically become
larger). The version 1 metadata records the amount of the device that
can be used to store data, so if a device in a version 1.1 or 1.2
array becomes larger, the metadata will still be visible, but the
extra space will not. In this case it might be useful to assemble the
array with
--update=devicesize. This will cause
mdadm to determine the maximum usable amount of space on each device and
update the relevant field in the metadata.
|
--auto-update-homehost | |
This flag is only meaningful with auto-assembly (see discussion below).
In that situation, if no suitable arrays are found for this homehost,
mdadm will rescan for any arrays at all and will assemble them and update the
homehost to match the current host.
|
For Manage mode:
Tag | Description |
-a, --add | |
add listed devices to a live array. When the array is in a degraded state
and you add a device, the device will be added as a spare device and
reconstruction on to the spare device will commence. Upon completion of
the reconstruction, the device will be transitioned to an active device.
If you add more devices than the array's normal capacity of active devices,
then they are automatically added as hot spare devices. In order to
utilize the spare devices, use the Grow mode of mdadm to increase the number
of active devices in the array.
|
--re-add |
re-add a device that was recently removed from an array. This only applies
to devices that were part of an array built without a persistent superblock,
and for which a write intent bitmap exists. In this isolated case, the
kernel will treat this device as a previous member of the array even though
there is no superblock to tell it to do so. For all add operations involving
arrays with persistent superblocks, use the --add command above and the
kernel will automatically determine whether a full resync or partial resync
is needed based upon the superblock state and the write intent bitmap
state (if it exists).
|
-r, --remove | |
remove listed devices. They must not be active. i.e. they should
be failed or spare devices. As well as the name of a device file
(e.g.
/dev/sda1) the words
failed and
detached can be given to
--remove. The first causes all failed devices to be removed. The second causes
any device which is no longer connected to the system (i.e. an open
returns
ENXIO) to be removed. This will only succeed for devices that are spares or
have already been marked as failed.
|
-f, --fail | |
mark listed devices as faulty.
As well as the name of a device file, the word
detached can be given. This will cause any device that has been detached from
the system to be marked as failed. It can then be removed.
|
--set-faulty | |
same as
--fail.
|
--write-mostly | |
Subsequent devices that are added or re-added will have the write-mostly
flag set. This is only valid for RAID1 and means that the md driver
will avoid reading from these devices if possible.
|
--readwrite | |
Subsequent devices that are added or re-added will have the write-mostly
flag cleared.
|
Each of these options requires that the first device listed is the array
to be acted upon, and the remainder are component devices to be added,
removed, or marked as faulty. Several different operations can be
specified for different devices, e.g.
mdadm /dev/md0 --add /dev/sda1 --fail /dev/sdb1 --remove /dev/sdb1
Each operation applies to all devices listed until the next
operation.
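To see how operations pair up with devices on such a command line, the rule can be traced with a small dry-run sketch. explain_manage is a hypothetical helper: it only prints the pairing, handles only the long option names, and never runs mdadm.

```shell
# Hypothetical dry-run helper: print one "array: operation device" line for
# each device, applying each operation to all devices up to the next operation.
# $1 = array device, remaining arguments = operations and component devices
explain_manage() {
    array=$1
    shift
    op=""
    for arg in "$@"; do
        case "$arg" in
            --add|--re-add|--fail|--remove) op=${arg#--} ;;   # remember the current operation
            *) echo "$array: $op $arg" ;;                      # a device: report the pairing
        esac
    done
}

explain_manage /dev/md0 --add /dev/sda1 /dev/sdc1 --fail /dev/sdb1 --remove /dev/sdb1
```

In this example both /dev/sda1 and /dev/sdc1 are paired with the add operation, because no other operation appears between them.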
If an array is using a write-intent bitmap, then devices which have
been removed can be re-added in a way that avoids a full
reconstruction but instead just updates the blocks that have changed
since the device was removed. For arrays with persistent metadata
(superblocks) this is done automatically. For arrays created with
--build mdadm needs to be told that this device was removed recently by using
--re-add instead of the
--add command (see above).
Devices can only be removed from an array if they are not in active
use, i.e. they must be spares or failed devices. To remove an active
device, it must first be marked as
faulty.
For Misc mode:
Tag | Description |
-Q, --query | |
Examine a device to see
(1) if it is an md device and (2) if it is a component of an md
array.
Information about what is discovered is presented.
|
-D, --detail | |
Print detail of one or more md devices.
|
-Y, --export | |
When used with
--detail or
--examine, output will be formatted as
key=value pairs for easy import into the environment.
|
-E, --examine | |
Print content of md superblock on device(s).
|
--sparc2.2 | |
If an array was created on a 2.2 Linux kernel patched with RAID
support, the superblock will have been created incorrectly, or at
least incompatibly with 2.4 and later kernels. Using the
--sparc2.2 flag with
--examine will fix the superblock before displaying it. If this appears to do
the right thing, then the array can be successfully assembled using
--assemble --update=sparc2.2.
|
-X, --examine-bitmap | |
Report information about a bitmap file.
The argument is either an external bitmap file or an array component
in case of an internal bitmap.
|
-R, --run | |
start a partially built array.
|
-S, --stop | |
deactivate array, releasing all resources.
|
-o, --readonly | |
mark array as readonly.
|
-w, --readwrite | |
mark array as readwrite.
|
--zero-superblock | |
If the device contains a valid md superblock, the block is
overwritten with zeros. With
--force the block where the superblock would be is overwritten even if it
doesn't appear to be valid.
|
-t, --test | |
When used with
--detail, the exit status of
mdadm is set to reflect the status of the device.
|
-W, --wait | |
For each md device given, wait for any resync, recovery, or reshape
activity to finish before returning.
mdadm will return with success if it actually waited for every device
listed, otherwise it will return failure.
|
For Incremental Assembly mode:
Tag | Description |
--rebuild-map, -r | |
Rebuild the map file
(/var/run/mdadm/map) that
mdadm uses to help track which arrays are currently being assembled.
|
--run, -R | |
Run any array assembled as soon as a minimal number of devices are
available, rather than waiting until all expected devices are present.
|
--scan, -s | |
Only meaningful with
-R this will scan the
map file for arrays that are being incrementally assembled and will try to
start any that are not already started. If any such array is listed
in
mdadm.conf as requiring an external bitmap, that bitmap will be attached first.
|
For Monitor mode:
Tag | Description |
-m, --mail | |
Give a mail address to send alerts to.
|
-p, --program, --alert | |
Give a program to be run whenever an event is detected.
|
-y, --syslog | |
Cause all events to be reported through syslog. The messages have
facility of daemon and varying priorities.
|
-d, --delay | |
Give a delay in seconds.
mdadm polls the md arrays and then waits this many seconds before polling
again. The default is 60 seconds.
|
-f, --daemonise | |
Tell
mdadm to run as a background daemon if it decides to monitor anything. This
causes it to fork and run in the child, and to disconnect from the
terminal. The process id of the child is written to stdout.
This is useful with
--scan which will only continue monitoring if a mail address or alert program
is found in the config file.
|
-i, --pid-file | |
When
mdadm is running in daemon mode, write the pid of the daemon process to
the specified file, instead of printing it on standard output.
|
-1, --oneshot | |
Check arrays only once. This will generate
NewArray events and more significantly
DegradedArray and
SparesMissing events. Running
mdadm --monitor --scan -1
from a cron script will ensure regular notification of any degraded arrays.
|
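The cron usage mentioned for --oneshot could be wrapped in a small script such as the one below. This is a sketch under assumptions: the function name check_arrays_once and the MDADM override variable are introduced here for illustration (the override lets the wrapper be tried with a stand-in command), and how the script is scheduled is left to the local cron setup.

```shell
# Hypothetical cron wrapper around the one-shot monitor check.
# MDADM may be overridden, e.g. MDADM=echo, to try the wrapper without
# touching real arrays (an assumption made for this sketch).
MDADM=${MDADM:-mdadm}

check_arrays_once() {
    # Check every array once; alerts go to the mail address or alert
    # program found in the config file.
    "$MDADM" --monitor --scan -1
}
```

A cron job would then source this file and call check_arrays_once at the desired interval.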
-t, --test | |
Generate a
TestMessage alert for every array found at startup. This alert gets mailed and
passed to the alert program. This can be used for testing that alert
messages do get through successfully.
|
ASSEMBLE MODE
Tag | Description |
Usage:
mdadm --assemble md-device options-and-component-devices...
Usage:
mdadm --assemble --scan md-devices-and-options...
Usage:
mdadm --assemble --scan options...
|
This usage assembles one or more raid arrays from pre-existing components.
For each array, mdadm needs to know the md device, the identity of the
array, and a number of component-devices. These can be found in a number of ways.
In the first usage example (without the
--scan) the first device given is the md device.
In the second usage example, all devices listed are treated as md
devices and assembly is attempted.
In the third (where no devices are listed) all md devices that are
listed in the configuration file are assembled.
If precisely one device is listed, but
--scan is not given, then
mdadm acts as though
--scan was given and identity information is extracted from the configuration file.
The identity can be given with the
--uuid option or the
--super-minor option, can be taken from the md-device record in the config file, or
can be taken from the superblock of the first component-device
listed on the command line.
Devices can be given on the
--assemble command line or in the config file. Only devices which have an md
superblock which contains the right identity will be considered for
any array.
The config file is only used if explicitly named with
--config or requested with (a possibly implicit)
--scan. In the latter case,
/etc/mdadm.conf is used.
If
--scan is not given, then the config file will only be used to find the
identity of md arrays.
Normally the array will be started after it is assembled. However if
--scan is not given and insufficient drives were listed to start a complete
(non-degraded) array, then the array is not started (to guard against
usage errors). To insist that the array be started in this case (as
may work for RAID1, 4, 5, 6, or 10), give the
--run flag.
If the md device does not exist, then it will be created provided the
intent is clear, i.e. the name must be in a standard form, or the
--auto option must be given to clarify how and whether the device should be
created.
This can be useful for handling partitioned devices (which don't have
a stable device number; it can change after a reboot) and when using
"udev" to manage your
/dev tree (udev cannot handle md devices because of the unusual device
initialisation conventions).
If the option to "auto" is "mdp" or "part" or (on the command line
only) "p", then mdadm will create a partitionable array, using the
first free one that is not in use and does not already have an entry
in /dev (apart from numeric /dev/md* entries).
If the option to "auto" is "yes" or "md" or (on the command line)
nothing, then mdadm will create a traditional, non-partitionable md
array.
It is expected that the "auto" functionality will be used to create
device entries with meaningful names such as "/dev/md/home" or
"/dev/md/root", rather than names based on the numerical array number.
When using option "auto" to create a partitionable array, the device
files for the first 4 partitions are also created. If a different
number is required, it can simply be appended to the auto option,
e.g. "auto=part8". Partition names are created by appending a digit
string to the device name, with an intervening "p" if the device name
ends with a digit.
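The partition naming rule above can be sketched in shell. This is only an illustration of the rule as stated; md_partition_name is a hypothetical helper and is not part of mdadm.

```shell
# Sketch: derive a partition device name from an md device name.
# md_partition_name is a hypothetical helper, not an mdadm command.
md_partition_name() {
    dev=$1
    part=$2
    case $dev in
        *[0-9]) printf '%s\n' "${dev}p${part}" ;;  # name ends in a digit: insert "p"
        *)      printf '%s\n' "${dev}${part}" ;;   # otherwise append the digits directly
    esac
}

md_partition_name /dev/md_d0 2     # prints /dev/md_d0p2
md_partition_name /dev/md/home 1   # prints /dev/md/home1
```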
The
--auto option is also available in Build and Create modes. As those modes do
not use a config file, the "auto=" config option does not apply to
these modes.
Auto Assembly
When
--assemble is used with
--scan and no devices are listed,
mdadm will first attempt to assemble all the arrays listed in the config
file.
If a
homehost has been specified (either in the config file or on the command line),
mdadm will look further for possible arrays and will try to assemble
anything that it finds which is tagged as belonging to the given
homehost. This is the only situation where
mdadm will assemble arrays without being given specific device name or
identity information for the array.
If
mdadm finds a consistent set of devices that look like they should comprise
an array, and if the superblock is tagged as belonging to the given
home host, it will automatically choose a device name and try to
assemble the array. If the array uses version-0.90 metadata, then the
minor number as recorded in the superblock is used to create a name in
/dev/md/ so for example
/dev/md/3. If the array uses version-1 metadata, then the
name from the superblock is used to similarly create a name in
/dev/md (the name will have any host prefix stripped first).
If
mdadm cannot find any array for the given host at all, and if
--auto-update-homehost is given, then
mdadm will search again for any array (not just an array created for this
host) and will assemble each assuming
--update=homehost. This will change the host tag in the superblock so that on the next run,
these arrays will be found without the second pass. The intention of
this feature is to support transitioning a set of md arrays to using
homehost tagging.
The reason for requiring arrays to be tagged with the homehost for
auto assembly is to guard against problems that can arise when moving
devices from one host to another.
BUILD MODE
Tag | Description |
Usage:
mdadm --build md-device --chunk=X --level=Y --raid-devices=Z devices
|
This usage is similar to
--create. The difference is that it creates an array without a superblock. With
these arrays there is no difference between initially creating the array and
subsequently assembling the array, except that hopefully there is useful
data there in the second case.
The level may be raid0, linear, multipath, or faulty, or one of their
synonyms. All devices must be listed and the array will be started
once complete.
CREATE MODE
Tag | Description |
Usage:
mdadm --create md-device --chunk=X --level=Y --raid-devices=Z devices
|
This usage will initialise a new md array, associate some devices with
it, and activate the array.
If the
--auto option is given (as described in more detail in the section on
Assemble mode), then the md device will be created with a suitable
device number if necessary.
As devices are added, they are checked to see if they contain raid
superblocks or filesystems. They are also checked to see if the variance in
device size exceeds 1%.
If any discrepancy is found, the array will not automatically be run, though
the presence of a
--run can override this caution.
To create a "degraded" array in which some devices are missing, simply
give the word "missing"
in place of a device name. This will cause
mdadm to leave the corresponding slot in the array empty.
For a RAID4 or RAID5 array at most one slot can be
"missing"; for a RAID6 array at most two slots.
For a RAID1 array, only one real device needs to be given. All of the
others can be
"missing".
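The limits on "missing" slots given above can be summarised as a small lookup. This is a sketch under the assumption that only the levels named in the text matter; max_missing_slots is a hypothetical helper, and levels the text does not mention are reported as "unknown" rather than guessed.

```shell
# Sketch: how many slots may be given as "missing" at creation time.
# max_missing_slots LEVEL RAID_DEVICES  (hypothetical helper, not part of mdadm)
max_missing_slots() {
    level=$1
    ndev=$2
    case $level in
        raid4|raid5) echo 1 ;;                 # at most one slot
        raid6)       echo 2 ;;                 # at most two slots
        raid1)       echo $((ndev - 1)) ;;     # only one real device is required
        *)           echo unknown ;;           # not covered by the text above
    esac
}

max_missing_slots raid5 4
max_missing_slots raid1 3
```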
When creating a RAID5 array,
mdadm will automatically create a degraded array with an extra spare drive.
This is because building the spare into a degraded array is in general faster than resyncing
the parity on a non-degraded, but not clean, array. This feature can
be overridden with the
--force option.
When creating an array with version-1 metadata a name for the array is
required.
If this is not given with the
--name option,
mdadm will choose a name based on the last component of the name of the
device being created. So if
/dev/md3 is being created, then the name
3 will be chosen.
If
/dev/md/home is being created, then the name
home will be used.
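The two naming cases above can be illustrated as follows. This is a sketch only: default_array_name is a hypothetical helper, and stripping a leading "md" from a numeric name is an assumption made to reproduce the /dev/md3 example, not a description of mdadm's internal code.

```shell
# Sketch: pick a default array name from the md device path.
# default_array_name is a hypothetical helper, not part of mdadm.
default_array_name() {
    base=${1##*/}                                 # last path component
    case $base in
        md[0-9]*) printf '%s\n' "${base#md}" ;;   # /dev/md3     -> 3 (assumed rule)
        *)        printf '%s\n' "$base" ;;        # /dev/md/home -> home
    esac
}

default_array_name /dev/md3
default_array_name /dev/md/home
```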
When creating a partition based array, using
mdadm with version-1.x metadata, the partition type should be set to
0xDA (non fs-data). This type selection allows for greater precision, since
using any other type [RAID auto-detect (0xFD) or a GNU/Linux partition (0x83)]
might create problems in the event of array recovery through a live cdrom.
A new array will normally get a randomly assigned 128-bit UUID which is
very likely to be unique. If you have a specific need, you can choose
a UUID for the array by giving the
--uuid= option. Be warned that creating two arrays with the same UUID is a
recipe for disaster. Also, using
--uuid= when creating a v0.90 array will silently override any
--homehost= setting.
The General Management options that are valid with
--create are:
Tag | Description |
--run |
insist on running the array even if some devices look like they might
be in use.
|
--readonly | |
start the array readonly (not supported yet).
|
MANAGE MODE
Tag | Description |
Usage:
mdadm device options... devices... |
This usage will allow individual devices in an array to be failed,
removed or added. It is possible to perform multiple operations with
one command. For example:
mdadm /dev/md0 -f /dev/hda1 -r /dev/hda1 -a /dev/hda1
will first mark
/dev/hda1 as faulty in
/dev/md0 and will then remove it from the array and finally add it back
in as a spare. However only one md array can be affected by a single
command.
MISC MODE
Tag | Description |
Usage:
mdadm options ... devices ... |
MISC mode includes a number of distinct operations that
operate on distinct devices. The operations are:
Tag | Description |
--query |
The device is examined to see if it is
(1) an active md array, or
(2) a component of an md array.
The information discovered is reported.
|
--detail | |
The device should be an active md device.
mdadm will display a detailed description of the array.
--brief or
--scan will cause the output to be less detailed and the format to be
suitable for inclusion in
/etc/mdadm.conf. The exit status of
mdadm will normally be 0 unless
mdadm failed to get useful information about the device(s); however, if the
--test option is given, then the exit status will be:
Tag | Description |
0 | The array is functioning normally. |
1 | The array has at least one failed device. |
2 | The array has multiple failed devices such that it is unusable. |
4 | There was an error while trying to get information about the device. |
|
--examine | |
The device should be a component of an md array.
mdadm will read the md superblock of the device and display the contents.
If
--brief or
--scan is given, then multiple devices that are components of the one array
are grouped together and reported in a single entry suitable
for inclusion in
/etc/mdadm.conf.
Having
--scan without listing any devices will cause all devices listed in the
config file to be examined.
|
--stop |
The devices should be active md arrays which will be deactivated, as
long as they are not currently in use.
|
--run |
This will fully activate a partially assembled md array.
|
--readonly | |
This will mark an active array as read-only, providing that it is
not currently being used.
|
--readwrite | |
This will change a
readonly array back to being read/write.
|
--scan |
For all operations except
--examine, --scan will cause the operation to be applied to all arrays listed in
/proc/mdstat. For
--examine, --scan causes all devices listed in the config file to be examined.
|
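The exit statuses listed under --detail with --test lend themselves to scripted health checks. The following is a minimal sketch; describe_detail_status is a hypothetical helper, and /dev/md0 in the usage comment is a placeholder.

```shell
# Sketch: translate the exit status of "mdadm --detail --test" into text.
# describe_detail_status is a hypothetical helper, not part of mdadm.
describe_detail_status() {
    case $1 in
        0) echo "array is functioning normally" ;;
        1) echo "array has at least one failed device" ;;
        2) echo "array has multiple failed devices and is unusable" ;;
        4) echo "error while getting information about the device" ;;
        *) echo "status $1 is not documented" ;;
    esac
}

# Typical use (needs mdadm and a real array; /dev/md0 is a placeholder):
#   mdadm --detail --test /dev/md0 >/dev/null
#   describe_detail_status $?
describe_detail_status 1
```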
MONITOR MODE
Tag | Description |
Usage:
mdadm --monitor options... devices...
|
This usage causes
mdadm to periodically poll a number of md arrays and to report on any events
noticed.
mdadm will never exit once it decides that there are arrays to be checked,
so it should normally be run in the background.
As well as reporting events,
mdadm may move a spare drive from one array to another if they are in the
same
spare-group and if the destination array has a failed drive but no spares.
If any devices are listed on the command line,
mdadm will only monitor those devices. Otherwise all arrays listed in the
configuration file will be monitored. Further, if
--scan is given, then any other md devices that appear in
/proc/mdstat will also be monitored.
The result of monitoring the arrays is the generation of events.
These events are passed to a separate program (if specified) and may
be mailed to a given E-mail address.
When passing events to a program, the program is run once for each event,
and is given 2 or 3 command-line arguments: the first is the
name of the event (see below), the second is the name of the
md device which is affected, and the third is the name of a related
device if relevant (such as a component device that has failed).
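As an illustration of the argument convention just described, an alert program could start from something like the sketch below. format_md_event is a hypothetical function name, and the message wording is an arbitrary choice.

```shell
# Sketch of the core of an alert program. It receives the event name, the md
# device, and optionally a related device, in that order.
# format_md_event is a hypothetical helper, not part of mdadm.
format_md_event() {
    event=$1
    array=$2
    related=${3:-}        # third argument is only present for some events
    if [ -n "$related" ]; then
        printf '%s on %s (related device: %s)\n' "$event" "$array" "$related"
    else
        printf '%s on %s\n' "$event" "$array"
    fi
}

# In a real alert script the arguments come from mdadm, for example:
#   format_md_event "$@" | logger -t mdadm-alert
format_md_event Fail /dev/md0 /dev/sda1
```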
If
--scan is given, then a program or an E-mail address must be specified on the
command line or in the config file. If neither is available, then
mdadm will not monitor anything.
Without
--scan, mdadm will continue monitoring as long as something was found to monitor. If
no program or email is given, then each event is reported to
stdout.
The different events are:
Tag | Description |
DeviceDisappeared | |
An md array which previously was configured appears to no longer be
configured. (syslog priority: Critical)
If
mdadm was told to monitor an array which is RAID0 or Linear, then it will
report
DeviceDisappeared with the extra information
Wrong-Level. This is because RAID0 and Linear do not support the device-failed,
hot-spare and resync operations which are monitored.
|
RebuildStarted | |
An md array started reconstruction. (syslog priority: Warning)
|
RebuildNN | |
Where
NN is 20, 40, 60, or 80, this indicates that the rebuild has passed that
percentage of the total. (syslog priority: Warning)
|
RebuildFinished | |
An md array that was rebuilding isn't any more, either because it
finished normally or was aborted. (syslog priority: Warning)
|
Fail |
An active component device of an array has been marked as
faulty. (syslog priority: Critical)
|
FailSpare | |
A spare component device which was being rebuilt to replace a faulty
device has failed. (syslog priority: Critical)
|
SpareActive | |
A spare component device which was being rebuilt to replace a faulty
device has been successfully rebuilt and has been made active.
(syslog priority: Info)
|
NewArray | |
A new md array has been detected in the
/proc/mdstat file. (syslog priority: Info)
|
DegradedArray | |
A newly noticed array appears to be degraded. This message is not
generated when
mdadm notices a drive failure which causes degradation, but only when
mdadm notices that an array is degraded when it first sees the array.
(syslog priority: Critical)
|
MoveSpare | |
A spare drive has been moved from one array in a
spare-group to another to allow a failed drive to be replaced.
(syslog priority: Info)
|
SparesMissing | |
If
mdadm has been told, via the config file, that an array should have a certain
number of spare devices, and
mdadm detects that it has fewer than this number when it first sees the
array, it will report a
SparesMissing message.
(syslog priority: Warning)
|
TestMessage | |
An array was found at startup, and the
--test flag was given.
(syslog priority: Info)
|
Only
Fail, FailSpare, DegradedArray, SparesMissing and
TestMessage cause Email to be sent. All events cause the program to be run.
The program is run with two or three arguments: the event
name, the array device and possibly a second device.
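An alert program that wants to avoid duplicating the mail mdadm already sends could test the event name against the list above. This is a sketch; event_sends_mail is a hypothetical helper.

```shell
# Sketch: succeed (status 0) for events that cause mail to be sent.
# event_sends_mail is a hypothetical helper, not part of mdadm.
event_sends_mail() {
    case $1 in
        Fail|FailSpare|DegradedArray|SparesMissing|TestMessage) return 0 ;;
        *) return 1 ;;
    esac
}

if event_sends_mail DegradedArray; then
    echo "DegradedArray: mail is sent"
fi
if ! event_sends_mail NewArray; then
    echo "NewArray: program only"
fi
```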
Each event has an associated array device (e.g.
/dev/md1) and possibly a second device. For
Fail, FailSpare, and
SpareActive the second device is the relevant component device.
For
MoveSpare the second device is the array that the spare was moved from.
For
mdadm to move spares from one array to another, the different arrays need to
be labeled with the same
spare-group in the configuration file. The
spare-group name can be any string; it is only necessary that different spare
groups use different names.
When
mdadm detects that an array in a spare group has fewer active
devices than necessary for the complete array, and has no spare
devices, it will look for another array in the same spare group that
has a full complement of working drives and a spare. It will then
attempt to remove the spare from the second array and add it to the
first.
If the removal succeeds but the adding fails, then it is added back to
the original array.
GROW MODE
The GROW mode is used for changing the size or shape of an active
array.
For this to work, the kernel must support the necessary change.
Various types of growth are being added during 2.6 development,
including restructuring a raid5 array to have more active devices.
Currently the only support available is to
o change the "size" attribute for RAID1, RAID5 and RAID6.
o increase the "raid-devices" attribute of RAID1, RAID5, and RAID6.
o add a write-intent bitmap to any array which supports these bitmaps, or
  remove a write-intent bitmap from such an array.
SIZE CHANGES
Normally when an array is built the "size" is taken from the smallest
of the drives. If all the small drives in an array are, one at a
time, removed and replaced with larger drives, then you could have an
array of large drives with only a small amount used. In this
situation, changing the "size" with "GROW" mode will allow the extra
space to start being used. If the size is increased in this way, a
"resync" process will start to make sure the new parts of the array
are synchronised.
Note that when an array changes size, any filesystem that may be
stored in the array will not automatically grow to use the space. The
filesystem will need to be explicitly told to use the extra space.
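The two steps (grow the array, then grow the filesystem) might look like the sketch below, which only prints the commands rather than running them. Assumptions: /dev/md0 is a placeholder, the installed mdadm accepts --size=max, and the filesystem is ext2/ext3/ext4 so that resize2fs applies; other filesystems have their own tools. plan_size_grow is a hypothetical helper.

```shell
# Sketch: print (do not run) the commands for using newly available space.
# plan_size_grow is a hypothetical helper, not part of mdadm.
plan_size_grow() {
    md=$1
    # Step 1: let the array use all the space available on its components.
    printf 'mdadm --grow %s --size=max\n' "$md"
    # Step 2: tell the filesystem about the extra space (ext2/3/4 assumed).
    printf 'resize2fs %s\n' "$md"
}

plan_size_grow /dev/md0
```

Review the printed commands, and wait for the resync to finish if desired, before running them by hand.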
RAID-DEVICES CHANGES
A RAID1 array can work with any number of devices from 1 upwards
(though 1 is not very useful). There may be times when you want to
increase or decrease the number of active devices. Note that this is
different than hot-add or hot-remove which changes the number of
inactive devices.
When reducing the number of devices in a RAID1 array, the slots which
are to be removed from the array must already be vacant. That is, the
devices which were in those slots must be failed and removed.
When the number of devices is increased, any hot spares that are
present will be activated immediately.
Increasing the number of active devices in a RAID5 is much more
work. Every block in the array will need to be moved to a new location.
From 2.6.17, the Linux Kernel is able to do this safely, including
restarting an interrupted "reshape".
When relocating the first few stripes on a raid5, it is not possible
to keep the data on disk completely consistent and crash-proof. To
provide the required safety, mdadm disables writes to the array while
this "critical section" is reshaped, and makes a backup of the data
that is in that section. This backup is normally stored in any spare
devices that the array has, however it can also be stored in a
separate file specified with the
--backup-file option. If this option is used, and the system does crash during the
critical period, the same file must be passed to
--assemble to restore the backup and reassemble the array.
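A reshape with a backup file, and the matching recovery step, could be written out as in this sketch. It only prints the commands; the device names, the device count and the backup path are placeholders, and print_reshape_commands is a hypothetical helper.

```shell
# Sketch: print (do not run) a reshape command and its recovery counterpart.
# print_reshape_commands MD NEW_RAID_DEVICES BACKUP_FILE COMPONENT...
print_reshape_commands() {
    md=$1
    ndev=$2
    backup=$3
    shift 3
    # Start the reshape, keeping the critical-section backup in a file.
    printf 'mdadm --grow %s --raid-devices=%s --backup-file=%s\n' "$md" "$ndev" "$backup"
    # After a crash during the critical section, pass the same file back.
    printf 'mdadm --assemble %s --backup-file=%s %s\n' "$md" "$backup" "$*"
}

print_reshape_commands /dev/md0 4 /root/md0-grow.bak /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
```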
BITMAP CHANGES
A write-intent bitmap can be added to, or removed from, an active
array. Either internal bitmaps or an external bitmap stored in a file
can be added. In the case of internal bitmaps, there is one copy of the
bitmap per device (since you never know what device might fail, you need
a copy on every device). The bitmap is stored between the array data
and the superblock, which limits the total number of bits available.
For a bitmap in an external file, only one copy is needed, but this
assumes that the bitmap file is not on an array device or else failure
of that device would take the only copy of the bitmap with it. For
this reason, the fact that the kernel will deadlock if you attempt
to use a file that resides on the array it is the bitmap for is
considered a safety feature.
INCREMENTAL MODE
Tag | Description |
Usage:
mdadm --incremental [--run] [--quiet] component-device
Usage:
mdadm --incremental --rebuild
Usage:
mdadm --incremental --run --scan
|
This mode is designed to be used in conjunction with a device
discovery system. As devices are found in a system, they can be
passed to
mdadm --incremental to be conditionally added to an appropriate array.
mdadm performs a number of tests to determine if the device is part of an
array, and which array it should be part of. If an appropriate array
is found, or can be created,
mdadm adds the device to the array and conditionally starts the array.
Note that
mdadm will only add devices to an array which were previously working
(active or spare) parts of that array. It does not currently support
automatic inclusion of a new drive as a spare in some array.
mdadm --incremental requires a bug-fix in all kernels through 2.6.19.
Hopefully, this will be fixed in 2.6.20; alternatively, apply the patch
which is included with the mdadm source distribution. If
mdadm detects that this bug is present, it will abort any attempt to use
--incremental.
The tests that
mdadm makes are as follows:
Tag | Description |
+
|
Is the device permitted by
mdadm.conf? That is, is it listed in a
DEVICES line in that file. If
DEVICES is absent then the default is to allow any device. Similarly, if
DEVICES contains the special word
partitions then any device is allowed. Otherwise the device name given to
mdadm must match one of the names or patterns in a
DEVICES line.
|
+
|
Does the device have a valid md superblock? If a specific metadata
version is requested with
--metadata or
-e then only that style of metadata is accepted, otherwise
mdadm finds any known version of metadata. If no
md metadata is found, the device is rejected.
|
+
|
Does the metadata match an expected array?
The metadata can match in two ways. Either there is an array listed
in
mdadm.conf which identifies the array (either by UUID, by name, by device list,
or by minor-number), or the array was created with a
homehost specified and that
homehost matches the one in
mdadm.conf or on the command line.
If
mdadm is not able to positively identify the array as belonging to the
current host, the device will be rejected.
|
+
|
mdadm keeps a list of arrays that it has partially assembled in
/var/run/mdadm/map (or
/var/run/mdadm.map if the directory doesn't exist). If no array exists which matches
the metadata on the new device,
mdadm must choose a device name and unit number. It does this based on any
name given in
mdadm.conf or any name information stored in the metadata. If this name
suggests a unit number, that number will be used, otherwise a free
unit number will be chosen. Normally
mdadm will prefer to create a partitionable array, however if the
CREATE line in
mdadm.conf suggests that a non-partitionable array is preferred, that will be
honoured.
|
+
|
Once an appropriate array is found or created and the device is added,
mdadm must decide if the array is ready to be started. It will
normally compare the number of available (non-spare) devices to the
number of devices that the metadata suggests need to be active. If
there are at least that many, the array will be started. This means
that if any devices are missing the array will not be restarted.
As an alternative,
--run may be passed to
mdadm in which case the array will be run as soon as there are enough
devices present for the data to be accessible. For a raid1, that
means one device will start the array. For a clean raid5, the array
will be started as soon as all but one drive is present.
Note that neither of these approaches is really ideal. If it can
be known that all device discovery has completed, then
mdadm -IRs
can be run which will try to start all arrays that are being
incrementally assembled. They are started in "read-auto" mode in
which they are read-only until the first write request. This means
that no metadata updates are made and no attempt at resync or recovery
happens. Further devices that are found before the first write can
still be added safely.
|
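The start decision described in the last test above can be sketched as a shell function. This is a simplified illustration, not mdadm's actual logic: should_start is a hypothetical helper, only raid1 and clean raid5 are handled for the --run case because those are the only levels the text describes, and array cleanliness is assumed rather than checked.

```shell
# Sketch: decide whether an incrementally assembled array should be started.
# should_start LEVEL AVAILABLE NEEDED RUN
#   AVAILABLE  number of non-spare devices found so far
#   NEEDED     number of active devices the metadata expects
#   RUN        "yes" if --run was given, anything else otherwise
# Returns 0 to start the array, 1 to keep waiting.
should_start() {
    level=$1
    avail=$2
    needed=$3
    run=$4
    if [ "$avail" -ge "$needed" ]; then
        return 0                                  # every expected device is present
    fi
    [ "$run" = yes ] || return 1                  # without --run, wait for the rest
    case $level in
        raid1) [ "$avail" -ge 1 ] ;;              # one device makes data accessible
        raid5) [ "$avail" -ge $((needed - 1)) ] ;; # all but one (clean array assumed)
        *)     return 1 ;;                        # other levels: not covered here
    esac
}

if should_start raid5 2 3 yes; then
    echo "start degraded raid5"
fi
```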
EXAMPLES
mdadm --query /dev/name-of-device
This will find out if a given device is a raid array, or is part of
one, and will provide brief information about the device.
mdadm --assemble --scan
This will assemble and start all arrays listed in the standard config
file. This command will typically go in a system startup file.
mdadm --stop --scan
This will shut down all arrays that can be shut down (i.e. are not
currently in use). This will typically go in a system shutdown script.
mdadm --follow --scan --delay=120
If (and only if) there is an Email address or program given in the
standard config file, then
monitor the status of all arrays listed in that file by
polling them every 2 minutes.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hd[ac]1
Create /dev/md0 as a RAID1 array consisting of /dev/hda1 and /dev/hdc1.
echo 'DEVICE /dev/hd*[0-9] /dev/sd*[0-9]' > mdadm.conf
mdadm --detail --scan >> mdadm.conf
This will create a prototype config file that describes currently
active arrays that are known to be made from partitions of IDE or SCSI drives.
This file should be reviewed before being used as it may
contain unwanted detail.
echo 'DEVICE /dev/hd[a-z] /dev/sd*[a-z]' > mdadm.conf
mdadm --examine --scan --config=mdadm.conf >> mdadm.conf
This will find arrays which could be assembled from existing IDE and
SCSI whole drives (not partitions), and store the information in the
format of a config file.
This file is very likely to contain unwanted detail, particularly
the
devices= entries. It should be reviewed and edited before being used as an
actual config file.
mdadm --examine --brief --scan --config=partitions
mdadm -Ebsc partitions
Create a list of devices by reading
/proc/partitions, scan these for RAID superblocks, and print out a brief listing of all
that were found.
mdadm -Ac partitions -m 0 /dev/md0
Scan all partitions and devices listed in
/proc/partitions and assemble
/dev/md0 out of all such devices with a RAID superblock with a minor number of 0.
mdadm --monitor --scan --daemonise > /var/run/mdadm
If config file contains a mail address or alert program, run mdadm in
the background in monitor mode monitoring all md devices. Also write
pid of mdadm daemon to
/var/run/mdadm.
mdadm -Iq /dev/somedevice
Try to incorporate newly discovered device into some array as
appropriate.
mdadm --incremental --rebuild --run --scan
Rebuild the array map from any current arrays, and then start any that
can be started.
mdadm /dev/md4 --fail detached --remove detached
Any devices which are components of /dev/md4 and which have been detached
from the system will be marked as faulty and then removed from the array.
mdadm --create --help
Provide help about the Create mode.
mdadm --config --help
Provide help about the format of the config file.
mdadm --help
Provide general help.
FILES
/proc/mdstat
If you're using the
/proc filesystem,
/proc/mdstat lists all active md devices with information about them.
mdadm uses this to find arrays when
--scan is given in Misc mode, and to monitor array reconstruction
in Monitor mode.
/etc/mdadm.conf
The config file lists which devices may be scanned to see if
they contain an MD superblock, and gives identifying information
(e.g. UUID) about known MD arrays. See
mdadm.conf(5)
for more details.
/var/run/mdadm/map
When
--incremental mode is used, this file gets a list of arrays currently being created.
If
/var/run/mdadm does not exist as a directory, then
/var/run/mdadm.map is used instead.
DEVICE NAMES
While entries in the /dev directory can have any format you like,
mdadm has an understanding of standard formats which it uses to guide its
behaviour when creating device files via the
--auto option.
The standard names for non-partitioned arrays (the only sort of md
array available in 2.4 and earlier) are either of
Tag | Description |
|
/dev/mdNN
/dev/md/NN
|
where NN is a number.
The standard names for partitionable arrays (as available from 2.6
onwards) are either of
|
|
/dev/md/dNN
/dev/md_dNN
|
Partition numbers should be indicated by adding "pMM" to these, thus "/dev/md/d1p2".
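The standard name forms above can be recognised with plain shell pattern matching. The sketch below is illustrative only: md_name_kind and is_number are hypothetical helpers, and the labels they print are arbitrary.

```shell
# Sketch: classify a device name against the standard forms listed above.
# is_number and md_name_kind are hypothetical helpers, not part of mdadm.
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

md_name_kind() {
    name=$1
    case $name in
        /dev/md/d*) rest=${name#/dev/md/d}; partitionable=yes ;;
        /dev/md_d*) rest=${name#/dev/md_d}; partitionable=yes ;;
        /dev/md/*)  rest=${name#/dev/md/};  partitionable=no ;;
        /dev/md*)   rest=${name#/dev/md};   partitionable=no ;;
        *)          echo non-standard; return 0 ;;
    esac
    if is_number "$rest"; then
        if [ "$partitionable" = yes ]; then
            echo partitionable
        else
            echo non-partitioned
        fi
    elif [ "$partitionable" = yes ] && is_number "${rest%%p*}" && is_number "${rest#*p}"; then
        echo partition             # NNpMM form, e.g. /dev/md/d1p2
    else
        echo non-standard          # e.g. /dev/md/home
    fi
}

md_name_kind /dev/md0
md_name_kind /dev/md/d1p2
```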
NOTE
mdadm was previously known as
mdctl.
mdadm is completely separate from the
raidtools package, and does not use the
/etc/raidtab configuration file at all.
SEE ALSO
RAID, see:
Related man pages:
mdadm.conf(5),
md(4),
raidtab(5),
raid0run(8),
raidstop(8),
mkraid(8).
|