Age | Commit message (Collapse) | Author | Files | Lines |
|
Multi-mount protection is feature that allows mke2fs, e2fsck, and
others to detect if the filesystem is mounted on a remote node (on
SAN disks) and avoid corrupting the filesystem. For e2fsprogs this
means that it checks the MMP block to see if the filesystem is in use,
and marks the filesystem busy while e2fsck is running on the system.
This is useful on SAN disks that are shared between high-availability
servers, or accessible by multiple nodes that aren't in HA pairs. MMP
isn't intended to serve as a primary HA exclusion mechanism, but as a
failsafe to protect against user, software, or hardware errors.
There is no requirement that e2fsck updates the MMP block at regular
intervals, but e2fsck does this occasionally to provide useful
information to the sysadmin in case of a detected conflict.
For the kernel (since Linux 3.0) MMP adds a "heartbeat" mechanism to
periodically write to disk (every few seconds by default) to notify
other nodes that the filesystem is still in use and unsafe to modify.
Originally-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Also cleaned up ext2_fs.h, and improved the byte swapping code so the
extra fields in the large inode are properly byte swapped.
Addresses-Debian-Bug: #641838
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
The set_fields commands (set_super_value, set_inode_field,
set_block_group) now handle fields which store in split fields on
ext4's on-disk format. For example, the superblock fields
s_blocks_count and s_blocks_count_hi.
The user can either set the low or high part of the field via
"blocks_count_lo" or "blocks_count_hi", or both parts can be set via
"blocks_count".
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Reserve EXT4_FEATURE_RO_COMPAT_METADATA_CSUM and
EXT2_FEATURE_COMPAT_EXCLUDE_BITMAP. Also reserve fields in the
superblock and the inode for the checksums. In the block group
descriptor, reserve the exclude bitmap field for the snapshot feature,
and checksums for the inode and block allocation bitmaps.
With this commit, the metadata checksum and exclude bitmap features
should have reserved all of the fields they need in ext4's on-disk
format.
This commit also fixes an a missing byte swap for s_overhead_blocks.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: Amir Goldstein <amir73il@gmail.com>
|
|
|
|
The dump program relies on fs->frag_size and the
EXT2_FRAGS_PER_BLOCK() macro. Kind of silly for it to do so, but it's
part of the kludgy way the dump program (which was originally written
for the BSD FFS was ported over to support ext2/3.) Given how it
makes assumptions about the ext2/3/4 file system being similar to the
BSD FFS, it's a bit of a miracle it works for ext4 --- or at least
appears to work...
Addresses-Debian-Bug: #636418
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This is needed to support online resizing for > 32-bit file systems
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
lib/ext2fs/bitmaps.c
lib/ext2fs/rw_bitmaps.c
misc/dumpe2fs.c
|
|
Change the EXT2_MAX_BLOCKS_PER_GROUP so that it takes the cluster size
into account. This way we can open bigalloc file systems without
ext2fs_open() thinking that they are corrupt.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
lib/e2p/ls.c
|
|
It turns out that it's very hard to calculate overheads in the face of
clustered allocation (bigalloc). This is because multiple metadata
blocks from different block groups can end up in the same allocation
cluster. Calculating the exact overhead requires O(all block bitmaps)
in memory, or O(number of block groups**2) in time. So we will
calculate this at mkfs time and stash it in the superblock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
lib/ext2fs/initialize.c
|
|
This adds the superblock fields needed so that dumpe2fs works and the
code points and renames the superblock fields from describing
fragments to clusters.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This patch adds support for detecting the new 'quota' feature in ext4.
The patch reserves code points for usr and group quota inodes and also
for the feature flag EXT4_FEATURE_RO_COMPAT_QUOTA.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Conflicts:
configure
configure.in
lib/ext2fs/ext2fs.h
misc/mke2fs.c
|
|
Add support for 2.6.35's new default mount options which can be
specified in the superblock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
resize/extent.c
|
|
Add superblock fields which track where and when the first and most
recent file system errors occured. These fields are displayed by
dumpe2fs and cleared by e2fsck.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
e2fsck/journal.c
e2fsck/pass1.c
e2fsck/pass2.c
misc/mke2fs.c
|
|
The documentation is not (as of this writing) fully complete, but
there is some documentation here:
http://sourceforge.net/apps/mediawiki/next3/index.php?title=Code_documentation
http://sourceforge.net/apps/mediawiki/next3/index.php?title=On-disk_format
http://sourceforge.net/projects/next3/files/Next3_Snapshots.pdf/download
... which will hopefully be updated soon to be fully up to date with
these assignments and more details about how things work.
For now, the assignments should avoid collisions with other new work
that people might want to do on ext3/4.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Reserve the EXT4_FEATURE_INCOMPAT_DIRDATA feature flag for adding
extra file data in ext2_dir_entry_2 entries.
This changes the on-disk layout in the following way.
Firstly, the ext2_dir_entry_2 file_type field now has a mask: that
limits the "filetype" information to the low 4 bits of this field.
Since these values are sequentially assigned, this allows for up to 7
more filetypes to be assigned. When reading the "filetype" field, the
high 4 bits should be masked off when converting to DT_* filetypes for
userspace.
The high 4 bits of "filetype" are used as a bitmask to register up to
4 different "extended" directory entry fields. Extended data fields
are packed without alignment into the directory entry after the "name"
field in order of increasing bitmask value, for each field where bit
is set. In order to avoid the need to "understand" each of the
extended fields, the first byte of each extended data field holds the
size of that data field (including the size itself), so they can be
skipped if not understood. For fields that change the semantics of
the filesystem it is expected that a separate ROCOMPAT or INCOMPAT
field is registered.
There is a single dirent data type defined currently, for Lustre:
which holds a 128-bit file identifier. It is expected that if there
are 64-bit inode values that this will be assigned the 0x20 value.
Should a need ever arise to use all 4 of the extended dirent data
fields, it would be possible to keep the last bit (0x80) for use as a
multiplexor that stores a 1-byte aggregate data size, then a series of
"<u8_size><u8_type><data>" records in the last extended data record.
It is not expected that this will actually be needed in the lifetime
of ext4.
Signed-off-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Reserve the EXT4_INCOMPAT_EA_INODE feature flag for use with
large extended attributes that are stored in a separate inode.
This changes the on-disk format in several ways:
First, replace the e_value_block field with e_value_inum, so that
an xattr entry can reference an external inode. This field is
currently unused, as all of the entries live in the same block.
struct ext2_ext_attr_entry {
__u8 e_name_len; /* length of name */
__u8 e_name_index; /* attribute name index */
__le16 e_value_offs; /* offset in disk block of value */
> __le32 e_value_inum; /* inode in which the value is stored */
__le32 e_value_size; /* size of attribute value */
__le32 e_hash; /* hash value of name and value */
char e_name[0]; /* attribute name */
}
Second, add a flag to the inode that indicates it is using a large
(external) extended attribute. This is needed so that when unlinking
an inode the xattrs will be scanned to unlink the xattr inodes
referenced by the main inode.
Third, for inodes that have a number of xattrs that are larger than
a single block, but not large enough to justify an external inode
(less than 64kB total xattr size, due to e_value_offs limitation)
the ext2_ext_attr_header->h_blocks field can grow beyond a single
block to represent a contiguous allocation of blocks for the xattr.
Signed-off-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
|
|
This is the userspace side of Jiaying's EOFBLOCKS patch. With
Aneesh's patches for .33, Jiaying's patch, and this one, xfstests
013/fsstress (even with direct IO enabled) has held up through many
runs.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This provides support for 48-bit file acl blocks.
Signed-off-by: Valerie Aurora Henson <vaurora@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Signed-off-by: Valerie Aurora Henson <vaurora@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This field tracks the lifetime amount of writes to the filesystem. It
will be updated by the kernel as well as by e2fsprogs programs which
write to the filesystem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
First try the ext3 ioctl, but if we get an error, try using the ext4
ioctl.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Add superblock definition, and dumpe2fs and debugfs support.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: Valerie Clement <valerie.clement@bull.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
- Add support for computing CRC-16 value.
- Add call to check/verify/set csum on block_groups.
- Add a test program to verify csum operations.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This patch includes the changes required to e2fsck to understand the
nlink count changes made in the kernel.
In e2fsck pass 4, when we fetch the actual link count, if it is
exceeds 65,000 we set the link count to 1. We silently fix the
situation where the nlink count of the directory is 1, and there are
fewer than 65,000 subdirectories, since since that can happen
naturally.
Patch originally from CFS, significantly rewritten by Theodore Ts'o.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
|
|
No application will ever use the ORPHAN_FS flag, since it only shows
up in kernel memory, but it's been pointed out it was first used in
ext3, and so it should be renamed for accuracy.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
configure
lib/ext2fs/ext2_fs.h
misc/e2image.c
|
|
The test_fs flag is an "ok to be used with test kernel code" flag. It
makes it easier for us to determine whether a filesystem should be
mounted using ext4 or not.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This is needed for all non-Linux/Hurd/Masix systems...
Addresses-Sourceforge-Bug: #1863819
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
configure
debian/rules
e2fsck/swapfs.c
lib/ext2fs/ext2_fs.h
|
|
The previous fix didn't quite work, but this one should!
Addresses-Sourceforge-Bug: #1861633
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
inode_uid() and inode_gid() weren't getting defined on systems that
were not Linux, Hurd, or Masix.
Addresses-Sourceforge-Bug: #1859778
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Addresses-Debian-Bug: #437720
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Merge commit 'c2d4300b8a4a13d8a78b86c386f76259f23feec2' into next
|
|
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
--
lib/ext2fs/ext2_fs.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
|
|
This patch remove masix support from lib/ext2fs.
Signed-off-by: Coly Li <coyli@suse.de>
|
|
Add macros to support variable-length group descriptors for ext4.
Signed-off-by: Valerie Clement <valerie.clement@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
There have been reported instances of a filesystem having been mounted
at 2 places at the same time causing a lot of damage to the
filesystem. This patch reserves superblock fields and an INCOMPAT flag
for adding multiple mount protection(MMP) support within the ext4
filesystem itself. The superblock will have a block number
(s_mmp_block) which will hold a MMP structure which has a sequence
number which will be periodically updated every 5 seconds by a mounted
filesystem. Whenever a filesystem will be mounted it will wait for
s_mmp_interval seconds to make sure that the MMP sequence does not
change. To further make sure, we write a random sequence number into
the MMP block and wait for another s_mmp_interval secs. If the
sequence no. doesn't change then the mount will succeed. In case of
failure, the nodename, bdevname and the time at which the MMP block
was last updated will be displayed. tune2fs can be used to set
s_mmp_interval as desired.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
|