author | Theodore Ts'o <tytso@mit.edu> | 2002-05-11 13:00:22 -0400 |
---|---|---|
committer | Theodore Ts'o <tytso@mit.edu> | 2002-05-11 13:00:22 -0400 |
commit | 583a1ce5d1b225a6b01fc2b30a3bcb21fd0d48c9 (patch) | |
tree | 24d0abe0bbd63441e8b50564e459761bf9c737cd /ext2ed/doc | |
parent | 7de6577cd93041742fad3da0046978793ebb978f (diff) | |
download | e2fsprogs-583a1ce5d1b225a6b01fc2b30a3bcb21fd0d48c9.tar.gz |
Check in ext2ed version 0.1
Diffstat (limited to 'ext2ed/doc')
-rw-r--r-- | ext2ed/doc/Ext2fs-overview-0.1.sgml | 898 | ||||
-rw-r--r-- | ext2ed/doc/ext2ed-design-0.1.sgml | 2102 | ||||
-rw-r--r-- | ext2ed/doc/ext2ed.8 | 72 | ||||
-rw-r--r-- | ext2ed/doc/user-guide-0.1.sgml | 1189 |
4 files changed, 4261 insertions, 0 deletions
diff --git a/ext2ed/doc/Ext2fs-overview-0.1.sgml b/ext2ed/doc/Ext2fs-overview-0.1.sgml
new file mode 100644
index 00000000..e1740d46
--- /dev/null
+++ b/ext2ed/doc/Ext2fs-overview-0.1.sgml
@@ -0,0 +1,898 @@
+<!doctype linuxdoc system>
+
+<!-- EXT2 filesystem overview -->
+<!-- First written: August 1 1995 -->
+<!-- Last updated: August 3 1995 -->
+<!-- This document is written using the Linux documentation project Linuxdoc-SGML DTD -->
+
+<article>
+
+<title>The extended-2 filesystem overview
+<author>Gadi Oxman, tgud@tochnapc2.technion.ac.il
+<date>v0.1, August 3 1995
+<toc>
+
+<!-- Begin of document -->
+
+<sect>Preface
+<p>
+
+This document attempts to present an overview of the internal structure of
+the ext2 filesystem. It was written in the summer of 1995, while I was working
+on the <tt>ext2 filesystem editor project (EXT2ED)</>.
+
+In the process of constructing EXT2ED, I acquired knowledge of the various
+design aspects of the ext2 filesystem. This document is a result of an
+effort to document this knowledge.
+
+This is only the initial version of this document. It is obviously neither
+error-free nor complete, but at least it provides a starting point.
+
+In the process of learning the subject, I have used the following sources /
+tools:
+<itemize>
+<item>	Experimenting with EXT2ED, as it was developed.
+<item>	The ext2 kernel sources:
+	<itemize>
+	<item>	The main ext2 include file,
+		<tt>/usr/include/linux/ext2_fs.h</>
+	<item>	The contents of the directory <tt>/usr/src/linux/fs/ext2</>.
+	<item>	The VFS layer sources (only a bit).
+	</itemize>
+<item>	The slides: The Second Extended File System, Current State, Future
+	Development, by <tt>Remy Card</>.
+<item>	The slides: Optimisation in File Systems, by <tt>Stephen Tweedie</>.
+<item>	The various ext2 utilities.
+</itemize>
+
+<sect>Introduction
+<p>
+
+The <tt>Second Extended File System (Ext2fs)</> is very popular among Linux
+users. If you use Linux, chances are that you are using the ext2 filesystem.
+
+Ext2fs was designed by <tt>Remy Card</> and <tt>Wayne Davison</>. It was
+implemented by <tt>Remy Card</> and was further enhanced by <tt>Stephen
+Tweedie</> and <tt>Theodore Ts'o</>.
+
+The ext2 filesystem is still under development. I will document here
+version 0.5a, which is distributed along with Linux 1.2.x. At the time of
+writing, the most recent version of Linux is 1.3.13, and the version of the
+ext2 kernel source is 0.5b. A lot of fancy enhancements are planned for the
+ext2 filesystem in Linux 1.3, so stay tuned.
+
+<sect>A filesystem - Why do we need it ?
+<p>
+
+I thought that before we dive into the various small details, I'd reserve a
+few minutes for a discussion of filesystems from a general point of view.
+
+A <tt>filesystem</> consists of two words - <tt>file</> and <tt>system</>.
+
+Everyone knows the meaning of the word <tt>file</> - A bunch of data put
+somewhere. Where? This is an important question. I, for example, usually
+throw almost everything into a single drawer, and have difficulties finding
+something later.
+
+This is where the <tt>system</> comes in - Instead of just throwing the data
+onto the device, we generalize and construct a <tt>system</> which will
+virtualize for us a nice and ordered structure in which we can arrange our
+data in much the same way as books are arranged in a library. The purpose of
+the filesystem, as I understand it, is to make it easy for us to update and
+maintain our data.
+
+Normally, by <tt>mounting</> filesystems, we just use the nice and logical
+virtual structure. However, the disk knows nothing about that - The device
+driver views the disk as a large continuous sheet of paper on which we can
+write notes wherever we wish. It is the task of the filesystem management code
+to store bookkeeping information which will serve the kernel for showing us
+the nice and ordered virtual structure.
+
+In this document, we consider one particular administrative structure - The
+Second Extended Filesystem.
+
+<sect>The Linux VFS layer
+<p>
+
+When Linux was first developed, it supported only one filesystem - The
+<tt>Minix</> filesystem. Today, Linux has the ability to support several
+filesystems concurrently. This was done by the introduction of another layer
+between the kernel and the filesystem code - The Virtual File System (VFS).
+
+The kernel "speaks" with the VFS layer. The VFS layer passes the kernel's
+request to the proper filesystem management code. I haven't learned much about
+the VFS layer, as I didn't need it for the construction of EXT2ED, so I
+can't elaborate on it. Just be aware that it exists.
+
+<sect>About blocks and block groups
+<p>
+
+In order to ease management, the ext2 filesystem logically divides the disk
+into small units called <tt>blocks</>. A block is the smallest unit which
+can be allocated. Each block in the filesystem can be <tt>allocated</> or
+<tt>free</>.
+<footnote>
+The Ext2fs source code refers to the concept of <tt>fragments</>, which I
+believe are supposed to be sub-block allocations. As far as I know,
+fragments are currently unsupported in Ext2fs.
+</footnote>
+The block size can be selected to be 1024, 2048 or 4096 bytes when creating
+the filesystem.
+
+Ext2fs groups together a fixed number of sequential blocks into a <tt>blocks
+group</>. The resulting situation is that the filesystem is managed as a
+series of blocks groups. This is done in order to keep related information
+physically close on the disk and to ease the management task. As a result,
+much of the filesystem management reduces to management of a single blocks
+group.
+
+<sect>The view of inodes from the point of view of a blocks group
+<p>
+
+Each file in the filesystem is assigned a special <tt>inode</>. I don't want
+to explain inodes now. Rather, I would like to treat the inode as another
+resource, much like a <tt>block</> - Each blocks group contains a limited
+number of inodes, and any specific inode can be <tt>allocated</> or
+<tt>unallocated</>.
+
+<sect>The group descriptors
+<p>
+
+Each blocks group is accompanied by a <tt>group descriptor</>. The group
+descriptor summarizes some necessary information about the specific blocks
+group. The definition of the group descriptor, as found in
+/usr/include/linux/ext2_fs.h, follows:
+
+<tscreen><code>
+struct ext2_group_desc
+{
+	__u32	bg_block_bitmap;	/* Blocks bitmap block */
+	__u32	bg_inode_bitmap;	/* Inodes bitmap block */
+	__u32	bg_inode_table;		/* Inodes table block */
+	__u16	bg_free_blocks_count;	/* Free blocks count */
+	__u16	bg_free_inodes_count;	/* Free inodes count */
+	__u16	bg_used_dirs_count;	/* Directories count */
+	__u16	bg_pad;
+	__u32	bg_reserved[3];
+};
+</code></tscreen>
+
+The three counters: <tt>bg_free_blocks_count, bg_free_inodes_count and
+bg_used_dirs_count</> provide statistics about the use of the three
+resources in a blocks group - The <tt>blocks</>, the <tt>inodes</> and the
+<tt>directories</>. I believe that they are used by the kernel for balancing
+the load between the various blocks groups.
+
+<tt>bg_block_bitmap</> contains the block number of the <tt>block allocation
+bitmap block</>. This is used to allocate / deallocate each block in the
+specific blocks group.
+
+<tt>bg_inode_bitmap</> is fully analogous to the previous variable - It
+contains the block number of the <tt>inode allocation bitmap block</>, which
+is used to allocate / deallocate each specific inode in the blocks group.
+
+<tt>bg_inode_table</> contains the block number of the start of the
+<tt>inode table of the current blocks group</>. The <tt>inode table</> is
+just the actual inodes which are reserved for the current blocks group.
+
+The block bitmap block, inode bitmap block and the inode table are created
+when the filesystem is created.
+
+The group descriptors are placed one after the other. Together they make the
+<tt>group descriptors table</>.
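Because the descriptors are stored back to back, finding the descriptor of a given blocks group amounts to array indexing. The following is only a sketch under stated assumptions, not kernel code: it assumes that the table starts in the block that follows the superblock's block, that each descriptor occupies 32 bytes (the size of the structure listed above), and that the caller already knows the block size and the block number holding the superblock. The name `group_desc_offset` and its parameters are made up for illustration.

```c
#include <stdint.h>

/* Size in bytes of struct ext2_group_desc as listed above:
 * 3 * 4 + 4 * 2 + 3 * 4 = 32. */
#define GROUP_DESC_SIZE 32u

/* Hypothetical helper: byte offset, from the start of the device, of the
 * descriptor that belongs to blocks group `group`.
 * Assumption: the descriptors table begins in the block right after the
 * block that holds the superblock (`superblock_block`). */
static uint64_t group_desc_offset(uint32_t block_size,
                                  uint32_t superblock_block,
                                  uint32_t group)
{
    uint64_t table_start = ((uint64_t)superblock_block + 1) * block_size;

    return table_start + (uint64_t)group * GROUP_DESC_SIZE;
}
```

Under these assumptions, with 1024-byte blocks and the superblock in block 1, the descriptor of group 3 would start at byte 2048 + 3*32 = 2144.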
+
+Each blocks group contains the entire table of group descriptors in its
+second block, right after the superblock. However, only the first copy (in
+group 0) is actually used by the kernel. The other copies are there for
+backup purposes and can be of use if the main copy gets corrupted.
+
+<sect>The block bitmap allocation block
+<p>
+
+Each blocks group contains one special block which is actually a map of all
+the blocks in the group, with respect to their allocation status. Each
+<tt>bit</> in the block bitmap indicates whether a specific block in the
+group is used or free.
+
+The format is actually quite simple - Just view the entire block as a series
+of bits. For example:
+
+Suppose the block size is 1024 bytes. As such, there is room for
+1024*8=8192 blocks in a blocks group. This number is one of the fields in the
+filesystem's <tt>superblock</>, which will be explained later.
+
+<itemize>
+<item>	Block 0 in the blocks group is managed by bit 0 of byte 0 in the bitmap
+	block.
+<item>	Block 7 in the blocks group is managed by bit 7 of byte 0 in the bitmap
+	block.
+<item>	Block 8 in the blocks group is managed by bit 0 of byte 1 in the bitmap
+	block.
+<item>	Block 8191 in the blocks group is managed by bit 7 of byte 1023 in the
+	bitmap block.
+</itemize>
+
+A value of "<tt>1</>" in the appropriate bit signals that the block is
+allocated, while a value of "<tt>0</>" signals that the block is
+unallocated.
+
+You will probably notice that typically, all the bits in a byte contain the
+same value, making the byte's value <tt>0</> or <tt>0ffh</>. This is done by
+the kernel on purpose in order to group related data in physically close
+blocks, since the physical device is usually optimized to handle such a close
+relationship.
+
+<sect>The inode allocation bitmap
+<p>
+
+The format of the inode allocation bitmap block is exactly like the format of
+the block allocation bitmap block.
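Since both bitmaps share this layout, looking up or marking an entry comes down to a byte index and a bit index. The sketch below is illustrative only: it assumes that "bit 0" of a byte means its least significant bit, and the helper names `bitmap_test` and `bitmap_set` are hypothetical, not taken from the ext2 sources.

```c
#include <stdint.h>

/* Returns 1 if entry `index` (a block or an inode, counted from 0 within
 * its blocks group) is marked as allocated in `bitmap`, 0 otherwise.
 * Assumption: bit 0 of each byte is the least significant bit. */
static int bitmap_test(const uint8_t *bitmap, uint32_t index)
{
    return (bitmap[index / 8] >> (index % 8)) & 1;
}

/* Marks entry `index` as allocated by setting its bit to 1. */
static void bitmap_set(uint8_t *bitmap, uint32_t index)
{
    bitmap[index / 8] |= (uint8_t)(1u << (index % 8));
}
```

With a 1024-byte bitmap, marking entry 8 sets bit 0 of byte 1, and marking entry 8191 sets bit 7 of byte 1023, which matches the list above.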
+The explanation above is valid here, with
+the word <tt>block</> replaced by <tt>inode</>. Typically, there are far fewer
+inodes than blocks in a blocks group and thus only part of the inode bitmap
+block is used. The number of inodes in a blocks group is another variable
+which is listed in the <tt>superblock</>.
+
+<sect>On the inode and the inode tables
+<p>
+
+The inode is a central resource in the ext2 filesystem. It is used for various
+purposes, but the main two are:
+<itemize>
+<item>	Support of files
+<item>	Support of directories
+</itemize>
+
+Each file, for example, will allocate one inode from the filesystem
+resources.
+
+An ext2 filesystem has a total number of available inodes which is determined
+while creating the filesystem. When all the inodes are used, you will not be
+able to create an additional file, even though there may still be free blocks
+on the filesystem.
+
+Each inode takes up 128 bytes in the filesystem. By default, <tt>mke2fs</>
+reserves an inode for each 4096 bytes of the filesystem space.
+
+The inodes are placed in several tables, each of which contains the same
+number of inodes and is placed in a different blocks group. The goal is to
+place inodes and their related files in the same blocks group, for reasons
+of locality.
+
+The number of inodes in a blocks group is available in the superblock variable
+<tt>s_inodes_per_group</>. For example, if there are 2000 inodes per group,
+group 0 will contain the inodes 1-2000, group 1 will contain the inodes
+2001-4000, and so on.
+
+Each inode table is accessed from the group descriptor of the specific
+blocks group which contains the table.
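The numbering scheme above can be turned into a small lookup. This is a hedged sketch rather than the kernel's code: it assumes inode numbers start at 1, that every group holds exactly `inodes_per_group` inodes, and that each inode takes 128 bytes as stated above. The structure `inode_location` and the function `locate_inode` are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define INODE_SIZE 128u	/* bytes per on-disk inode, as stated in the text */

/* Hypothetical result type: where an inode lives. */
struct inode_location {
    uint32_t group;         /* blocks group holding the inode            */
    uint32_t index;         /* 0-based position inside that group's table */
    uint32_t table_offset;  /* byte offset from the start of the table   */
};

/* Maps an inode number (counted from 1) to its group and table position. */
static struct inode_location locate_inode(uint32_t ino,
                                          uint32_t inodes_per_group)
{
    struct inode_location loc;

    loc.group = (ino - 1) / inodes_per_group;
    loc.index = (ino - 1) % inodes_per_group;
    loc.table_offset = loc.index * INODE_SIZE;
    return loc;
}
```

With 2000 inodes per group, `locate_inode(2000, 2000)` gives group 0, index 1999 and a table offset of 255872 bytes. The index is also the bit position of the inode in that group's inode allocation bitmap.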
+
+The structure of an inode in Ext2fs follows:
+
+<tscreen><code>
+struct ext2_inode {
+	__u16	i_mode;		/* File mode */
+	__u16	i_uid;		/* Owner Uid */
+	__u32	i_size;		/* Size in bytes */
+	__u32	i_atime;	/* Access time */
+	__u32	i_ctime;	/* Creation time */
+	__u32	i_mtime;	/* Modification time */
+	__u32	i_dtime;	/* Deletion Time */
+	__u16	i_gid;		/* Group Id */
+	__u16	i_links_count;	/* Links count */
+	__u32	i_blocks;	/* Blocks count */
+	__u32	i_flags;	/* File flags */
+	union {
+		struct {
+			__u32  l_i_reserved1;
+		} linux1;
+		struct {
+			__u32  h_i_translator;
+		} hurd1;
+		struct {
+			__u32  m_i_reserved1;
+		} masix1;
+	} osd1;				/* OS dependent 1 */
+	__u32	i_block[EXT2_N_BLOCKS];/* Pointers to blocks */
+	__u32	i_version;	/* File version (for NFS) */
+	__u32	i_file_acl;	/* File ACL */
+	__u32	i_dir_acl;	/* Directory ACL */
+	__u32	i_faddr;	/* Fragment address */
+	union {
+		struct {
+			__u8	l_i_frag;	/* Fragment number */
+			__u8	l_i_fsize;	/* Fragment size */
+			__u16	i_pad1;
+			__u32	l_i_reserved2[2];
+		} linux2;
+		struct {
+			__u8	h_i_frag;	/* Fragment number */
+			__u8	h_i_fsize;	/* Fragment size */
+			__u16	h_i_mode_high;
+			__u16	h_i_uid_high;
+			__u16	h_i_gid_high;
+			__u32	h_i_author;
+		} hurd2;
+		struct {
+			__u8	m_i_frag;	/* Fragment number */
+			__u8	m_i_fsize;	/* Fragment size */
+			__u16	m_pad1;
+			__u32	m_i_reserved2[2];
+		} masix2;
+	} osd2;				/* OS dependent 2 */
+};
+</code></tscreen>
+
+<sect1>The allocated blocks
+<p>
+
+The basic functionality of an inode is to group together a series of
+allocated blocks. There is no limitation on the allocated blocks - Any
+block can be allocated to any inode. Nevertheless, block allocation will
+usually be done in series to take advantage of the locality principle.
+
+The inode is not always used in that way. I will now explain the allocation
+of blocks, assuming that the current inode type indeed refers to a list of
+allocated blocks.
+
+It was found experimentally that many of the files in the filesystem are
+actually quite small. To take advantage of this effect, the kernel provides
+storage of up to 12 block numbers in the inode itself. Those blocks are
+called <tt>direct blocks</>. The advantage is that once the kernel has the
+inode, it can directly access the file's blocks, without an additional disk
+access. Those 12 blocks are directly specified in the variables
+<tt>i_block[0] to i_block[11]</>.
+
+<tt>i_block[12]</> is the <tt>indirect block</> - The block pointed to by
+i_block[12] will <tt>not</> be a data block. Rather, it will just contain a
+list of data block numbers. For example, if the block size is 1024 bytes, since
+each block number is 4 bytes long, there will be room for 256 block
+numbers. That is, block 13 through block 268 in the file will be accessed by
+the <tt>indirect block</> method. The penalty in this case, compared to the
+direct blocks case, is that an additional access to the device is needed -
+We need <tt>two</> accesses to reach the required data block.
+
+In much the same way, <tt>i_block[13]</> is the <tt>double indirect block</>
+and <tt>i_block[14]</> is the <tt>triple indirect block</>.
+
+<tt>i_block[13]</> points to a block which contains pointers to indirect
+blocks. Each one of them is handled in the way described above.
+
+In much the same way, the triple indirect block is just an additional level
+of indirection - It will point to a list of double indirect blocks.
+
+<sect1>The i_mode variable
+<p>
+
+The i_mode variable is used to determine the <tt>inode type</> and the
+associated <tt>permissions</>. It is best described by representing it as an
+octal number. Since it is a 16 bit variable, there will be 6 octal digits.
+Those are divided into two parts - The rightmost 4 digits and the leftmost 2
+digits.
+
+<sect2>The rightmost 4 octal digits
+<p>
+
+The rightmost 4 digits are <tt>bit options</> - Each bit has its own
+purpose.
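The split of i_mode into its two parts can be sketched with a shift and a mask. This is only an illustration of the division described above (four octal digits are 12 bits, which leaves the top 4 bits for the leftmost two digits); the function names `mode_type_digits` and `mode_option_bits` are hypothetical.

```c
#include <stdint.h>

/* Leftmost two octal digits of i_mode (bits 12-15). */
static unsigned mode_type_digits(uint16_t i_mode)
{
    return (unsigned)i_mode >> 12;
}

/* Rightmost four octal digits of i_mode (bits 0-11). */
static unsigned mode_option_bits(uint16_t i_mode)
{
    return (unsigned)i_mode & 07777u;
}
```

For an i_mode of octal 100644, for instance, `mode_type_digits` returns octal 10 and `mode_option_bits` returns octal 0644.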
+
+The last 3 digits (Octal digits 0,1 and 2) are just the usual permissions,
+in the known form <tt>rwxrwxrwx</>. Digit 2 refers to the user, digit 1 to
+the group and digit 0 to everyone else. They are used by the kernel to grant
+or deny access to the object represented by this inode.
+<footnote>
+A <tt>smarter</> permissions control is one of the enhancements planned for
+Linux 1.3 - The ACL (Access Control Lists). Actually, from browsing the
+kernel source, some of the ACL handling is already done.
+</footnote>
+
+Bit number 9 signals that the file (I'll refer to the object represented by
+the inode as file even though it can be a special device, for example) is
+<tt>set VTX</>. I still don't know what "VTX" means.
+
+Bit number 10 signals that the file is <tt>set group id</> - I don't know
+exactly the meaning of the above either.
+
+Bit number 11 signals that the file is <tt>set user id</>, which means that
+the file will run with the effective user id of the file's owner.
+
+<sect2>The leftmost two octal digits
+<p>
+
+Note that the leftmost octal digit can only be 0 or 1, since the total number
+of bits is 16.
+
+Those digits, as opposed to the rightmost 4 digits, are not bit mapped
+options. They determine the type of the "file" to which the inode belongs:
+<itemize>
+<item>	<tt>01</> - The file is a <tt>FIFO</>.
+<item>	<tt>02</> - The file is a <tt>character device</>.
+<item>	<tt>04</> - The file is a <tt>directory</>.
+<item>	<tt>06</> - The file is a <tt>block device</>.
+<item>	<tt>10</> - The file is a <tt>regular file</>.
+<item>	<tt>12</> - The file is a <tt>symbolic link</>.
+<item>	<tt>14</> - The file is a <tt>socket</>.
+</itemize>
+
+<sect1>Time and date
+<p>
+
+Linux records the last time at which various operations occurred with the
+file. The time and date are saved in the standard C library format - The
+number of seconds which passed since 00:00:00 GMT, January 1, 1970.
+The following times are recorded:
+<itemize>
+<item>	<tt>i_ctime</> - The time at which the inode was last allocated. In
+	other words, the time at which the file was created.
+<item>	<tt>i_mtime</> - The time at which the file was last modified.
+<item>	<tt>i_atime</> - The time at which the file was last accessed.
+<item>	<tt>i_dtime</> - The time at which the inode was deallocated. In
+	other words, the time at which the file was deleted.
+</itemize>
+
+<sect1>i_size
+<p>
+
+<tt>i_size</> contains information about the size of the object represented by
+the inode. If the inode corresponds to a regular file, this is just the size
+of the file in bytes. In other cases, the interpretation of the variable is
+different.
+
+<sect1>User and group id
+<p>
+
+The user and group id of the file are just saved in the variables
+<tt>i_uid</> and <tt>i_gid</>.
+
+<sect1>Hard links
+<p>
+
+Later, when we discuss the implementation of directories, it will be
+explained that each <tt>directory entry</> points to an inode. It is quite
+possible that a <tt>single inode</> will be pointed to from <tt>several</>
+directories. In that case, we say that there exist <tt>hard links</> to the
+file - The file can be accessed from each of the directories.
+
+The kernel keeps track of the number of hard links in the variable
+<tt>i_links_count</>. The variable is set to "1" when first allocating the
+inode, and is incremented with each additional link. Deletion of a file will
+delete the current directory entry and will decrement the number of links.
+Only when this number reaches zero will the inode actually be deallocated.
+
+The name <tt>hard link</> is used to distinguish the alias method
+described above from another alias method called <tt>symbolic linking</>,
+which will be described later.
+
+<sect1>The Ext2fs extended flags
+<p>
+
+The ext2 filesystem associates additional flags with an inode. The extended
+attributes are stored in the variable <tt>i_flags</>.
+<tt>i_flags</> is a 32
+bit variable. Only the 7 rightmost bits are defined. Of them, only 5 bits
+are used in version 0.5a of the filesystem. Specifically, the
+<tt>undelete</> and the <tt>compress</> features are not implemented, and
+are to be introduced in Linux 1.3 development.
+
+The currently available flags are:
+<itemize>
+<item>	bit 0 - Secure deletion.
+
+	When this bit is on, the file's blocks are zeroed when the file is
+	deleted. With this bit off, they will just be left with their
+	original data when the inode is deallocated.
+<item>	bit 1 - Undelete.
+
+	This bit is not supported yet. It will be used to provide an
+	<tt>undelete</> feature in future Ext2fs developments.
+<item>	bit 2 - Compress file.
+
+	This bit is also not supported. The plan is to offer "compression on
+	the fly" in future releases.
+<item>	bit 3 - Synchronous updates.
+
+	With this bit on, the meta-data will be written synchronously to the
+	disk, as if the filesystem was mounted with the "sync" mount option.
+<item>	bit 4 - Immutable file.
+
+	When this bit is on, the file will stay as it is - It cannot be
+	changed, deleted or renamed, no hard links can be made to it, etc,
+	until the bit is cleared.
+<item>	bit 5 - Append only file.
+
+	With this option active, data will only be appended to the file.
+<item>	bit 6 - Do not dump this file.
+
+	I think that this bit is used by the port of dump to Linux (ported by
+	<tt>Remy Card</>) to check if the file should not be dumped.
+</itemize>
+
+<sect1>Symbolic links
+<p>
+
+The <tt>hard links</> presented above are just additional pointers to the same
+inode. The important aspect is that the inode number is <tt>fixed</> when
+the link is created. This means that the implementation details of the
+filesystem are visible to the user - In a pure abstract usage of the
+filesystem, the user should not care about inodes.
+
+The above causes several limitations:
+<itemize>
+<item>	Hard links can be made only within the same filesystem.
+	This is obvious,
+	since a hard link is just an inode number in some directory entry,
+	and the above elements are filesystem specific.
+<item>	You can not "replace" the file which is pointed to by the hard link
+	after the link creation. "Replacing" the file in one directory will
+	still leave the original file in the other directory - The
+	"replacement" will not deallocate the original inode, but rather
+	allocate another inode for the new version, and the directory entry
+	at the other place will just point to the old inode number.
+</itemize>
+
+A <tt>symbolic link</>, on the other hand, is resolved at <tt>run time</>. A
+symbolic link is just a <tt>pathname</> which is accessible from an inode.
+As such, it "speaks" in the language of the abstract filesystem. When the
+kernel reaches a symbolic link, it will <tt>follow it at run time</> using
+its normal way of reaching directories.
+
+As such, a symbolic link can be made <tt>across different filesystems</> and a
+replacement of a file with a new version will automatically be active on all
+its symbolic links.
+
+The disadvantage is cost: a hard link consumes no space except for a small
+directory entry. A symbolic link, on the other hand, consumes at least an
+inode, and can also consume one block.
+
+When the inode is identified as a symbolic link, the kernel needs to find
+the path to which it points.
+
+<sect2>Fast symbolic links
+<p>
+
+When the pathname is shorter than 60 bytes, it can be saved directly in the
+inode, in the <tt>i_block[0] - i_block[14]</> variables (15 entries of 4
+bytes each), since those are not needed as block pointers in that case. This
+is called a <tt>fast</> symbolic link. It is fast because the pathname
+resolution can be done using the inode itself, without accessing additional
+blocks. It is also economical, since it allocates only an inode. The length
+of the pathname is stored in the <tt>i_size</> variable.
+
+<sect2>Slow symbolic links
+<p>
+
+Starting from 60 bytes, an additional block is allocated (by the use of
+<tt>i_block[0]</>) and the pathname is stored in it. It is called slow
+because the kernel needs to read an additional block to resolve the pathname.
+The length is again saved in <tt>i_size</>.
+
+<sect1>i_version
+<p>
+
+<tt>i_version</> is used with regard to the Network File System. I don't know
+its exact use.
+
+<sect1>Reserved variables
+<p>
+
+As far as I know, the variables which are connected to ACL and fragments
+are not currently used. They will be supported in future versions.
+
+Ext2fs is being ported to other operating systems. As far as I know,
+at least in Linux, the OS dependent variables are also not used.
+
+<sect1>Special reserved inodes
+<p>
+
+The first ten inodes on the filesystem are special inodes:
+<itemize>
+<item>	Inode 1 is the <tt>bad blocks inode</> - I believe that its data
+	blocks contain a list of the bad blocks in the filesystem, which
+	should not be allocated.
+<item>	Inode 2 is the <tt>root inode</> - The inode of the root directory.
+	It is the starting point for reaching a known path in the filesystem.
+<item>	Inode 3 is the <tt>acl index inode</>. Access control lists are
+	currently not supported by the ext2 filesystem, so I believe this
+	inode is not used.
+<item>	Inode 4 is the <tt>acl data inode</>. Of course, the above applies
+	here too.
+<item>	Inode 5 is the <tt>boot loader inode</>. I don't know its
+	usage.
+<item>	Inode 6 is the <tt>undelete directory inode</>. It is also a
+	foundation for future enhancements, and is currently not used.
+<item>	Inodes 7-10 are <tt>reserved</> and currently not used.
+</itemize>
+
+<sect>Directories
+<p>
+
+A directory is implemented in the same way as files are implemented (with
+the direct blocks, indirect blocks, etc) - It is just a file which is
+formatted with a special format - A list of directory entries.
+
+The definition of a directory entry follows:
+
+<tscreen><code>
+struct ext2_dir_entry {
+	__u32	inode;			/* Inode number */
+	__u16	rec_len;		/* Directory entry length */
+	__u16	name_len;		/* Name length */
+	char	name[EXT2_NAME_LEN];	/* File name */
+};
+</code></tscreen>
+
+Ext2fs supports file names of varying lengths, up to 255 bytes. The
+<tt>name</> field above just contains the file name. Note that it is
+<tt>not zero terminated</>; Instead, the variable <tt>name_len</> contains
+the length of the file name.
+
+The variable <tt>rec_len</> is provided because the directory entries are
+padded with zeroes so that the next entry will be at an offset which is
+a multiple of 4. The resulting directory entry size is stored in
+<tt>rec_len</>. If the directory entry is the last in the block, it is
+padded with zeroes till the end of the block, and rec_len is updated
+accordingly.
+
+The <tt>inode</> variable points to the inode of the above file.
+
+Deletion of a directory entry is done by adding the space of the deleted
+entry to the previous (or next, I am not sure) entry.
+
+<sect>The superblock
+<p>
+
+The <tt>superblock</> is a block which contains information which describes
+the state of the internal filesystem.
+
+The superblock is located at the <tt>fixed offset 1024</> in the device. Its
+length is also 1024 bytes.
+
+The superblock, like the group descriptors, is copied on each blocks group
+boundary for backup purposes. However, only the main copy is used by the
+kernel.
+
+The superblock contains three types of information:
+<itemize>
+<item>	Filesystem parameters which are fixed and which were determined when
+	this specific filesystem was created. Some of those parameters can
+	be different in different installations of the ext2 filesystem, but
+	can not be changed once the filesystem was created.
+<item>	Filesystem parameters which are tunable - Can always be changed.
+<item>	Information about the current filesystem state.
+</itemize>
+
+The superblock definition follows:
+
+<tscreen><code>
+struct ext2_super_block {
+	__u32	s_inodes_count;		/* Inodes count */
+	__u32	s_blocks_count;		/* Blocks count */
+	__u32	s_r_blocks_count;	/* Reserved blocks count */
+	__u32	s_free_blocks_count;	/* Free blocks count */
+	__u32	s_free_inodes_count;	/* Free inodes count */
+	__u32	s_first_data_block;	/* First Data Block */
+	__u32	s_log_block_size;	/* Block size */
+	__s32	s_log_frag_size;	/* Fragment size */
+	__u32	s_blocks_per_group;	/* # Blocks per group */
+	__u32	s_frags_per_group;	/* # Fragments per group */
+	__u32	s_inodes_per_group;	/* # Inodes per group */
+	__u32	s_mtime;		/* Mount time */
+	__u32	s_wtime;		/* Write time */
+	__u16	s_mnt_count;		/* Mount count */
+	__s16	s_max_mnt_count;	/* Maximal mount count */
+	__u16	s_magic;		/* Magic signature */
+	__u16	s_state;		/* File system state */
+	__u16	s_errors;		/* Behaviour when detecting errors */
+	__u16	s_pad;
+	__u32	s_lastcheck;		/* time of last check */
+	__u32	s_checkinterval;	/* max. time between checks */
+	__u32	s_creator_os;		/* OS */
+	__u32	s_rev_level;		/* Revision level */
+	__u16	s_def_resuid;		/* Default uid for reserved blocks */
+	__u16	s_def_resgid;		/* Default gid for reserved blocks */
+	__u32	s_reserved[235];	/* Padding to the end of the block */
+};
+</code></tscreen>
+
+<sect1>superblock identification
+<p>
+
+The ext2 filesystem's superblock is identified by the <tt>s_magic</> field.
+The current ext2 magic number is 0xEF53. I presume that "EF" means "Extended
+Filesystem". In versions of the ext2 filesystem prior to 0.2B, the magic
+number was 0xEF51. Those filesystems are not compatible with the current
+versions; Specifically, the group descriptors definition is different. I
+doubt that such an installation still exists.
+
+<sect1>Filesystem fixed parameters
+<p>
+
+By using the word <tt>fixed</>, I mean fixed with respect to a particular
+installation.
+Those variables are usually not fixed with respect to
+different installations.
+
+The <tt>block size</> is determined by using the <tt>s_log_block_size</>
+variable. The block size is 1024*pow (2,s_log_block_size) and should be
+between 1024 and 4096. The available options are 1024, 2048 and 4096.
+
+<tt>s_inodes_count</> contains the total number of available inodes.
+
+<tt>s_blocks_count</> contains the total number of available blocks.
+
+<tt>s_first_data_block</> specifies which <tt>device block</> holds the
+<tt>superblock</>. The superblock is always present at the fixed
+offset 1024, but the device block numbering can differ. For example, if the
+block size is 1024, the superblock will be at <tt>block 1</> with respect to
+the device. However, if the block size is 4096, offset 1024 is included in
+<tt>block 0</> of the device, and in that case <tt>s_first_data_block</>
+will contain 0. At least this is how I understood this variable.
+
+<tt>s_blocks_per_group</> contains the number of blocks which are grouped
+together as a blocks group.
+
+<tt>s_inodes_per_group</> contains the number of inodes available in a blocks
+group. I think that this is always the total number of inodes divided by the
+number of blocks groups.
+
+<tt>s_creator_os</> contains a code number which specifies the operating
+system which created this specific filesystem:
+<itemize>
+<item>	<tt>Linux</> :-) is specified by the value <tt>0</>.
+<item>	<tt>Hurd</> is specified by the value <tt>1</>.
+<item>	<tt>Masix</> is specified by the value <tt>2</>.
+</itemize>
+
+<tt>s_rev_level</> contains the major version of the ext2 filesystem.
+Currently this is always <tt>0</>, as the most recent version is 0.5B. It
+will probably take some time until we reach version 1.0.
+
+As far as I know, fragments (sub-block allocations) are currently not
+supported and hence a block is equal to a fragment.
+As a result,
+<tt>s_log_frag_size</> and <tt>s_frags_per_group</> are always equal to
+<tt>s_log_block_size</> and <tt>s_blocks_per_group</>, respectively.
+
+<sect1>Ext2fs error handling
+<p>
+
+The ext2 filesystem error handling is based on the following philosophy:
+<enum>
+<item>	Identification of problems is done by the kernel code.
+<item>	The correction task is left to an external utility, such as
+	<tt>e2fsck by Theodore Ts'o</> for <tt>automatic</> analysis and
+	correction, or perhaps <tt>debugfs by Theodore Ts'o</> and
+	<tt>EXT2ED by myself</>, for <tt>hand</> analysis and correction.
+</enum>
+
+The <tt>s_state</> variable is used by the kernel to pass the identification
+result to third party utilities:
+<itemize>
+<item>	<tt>bit 0</> of s_state is reset when the partition is mounted and
+	set when the partition is unmounted. Thus, a value of 0 on an
+	unmounted filesystem means that the filesystem was not unmounted
+	properly - The filesystem is not "clean" and probably contains
+	errors.
+<item>	<tt>bit 1</> of s_state is set by the kernel when it detects an
+	error in the filesystem. A value of 0 doesn't mean that there isn't
+	an error in the filesystem, just that the kernel didn't find any.
+</itemize>
+
+The kernel behavior when an error is found is determined by the user tunable
+parameter <tt>s_errors</>:
+<itemize>
+<item>	The kernel will ignore the error and continue if <tt>s_errors=1</>.
+<item>	The kernel will remount the filesystem in read-only mode if
+	<tt>s_errors=2</>.
+<item>	A kernel panic will be issued if <tt>s_errors=3</>.
+</itemize>
+
+The default behavior is to ignore the error.
+
+<sect1>Additional parameters used by e2fsck
+<p>
+
+Of course, <tt>e2fsck</> will check the filesystem if errors were detected
+or if the filesystem is not clean.
+
+In addition, each time the filesystem is mounted, <tt>s_mnt_count</> is
+incremented.
When s_mnt_count reaches <tt>s_max_mnt_count</>, <tt>e2fsck</> +will force a check on the filesystem even though it may be clean. It will +then zero s_mnt_count. <tt>s_max_mnt_count</> is a tunable parameter. + +E2fsck also records the last time at which the file system was checked in +the <tt>s_lastcheck</> variable. The user tunable parameter +<tt>s_checkinterval</> will contain the number of seconds which are allowed +to pass since <tt>s_lastcheck</> until a check is forced again. A value of +<tt>0</> disables time-based checking. + +<sect1>Additional user tunable parameters +<p> + +<tt>s_r_blocks_count</> contains the number of disk blocks which are +reserved for root, the user whose id number is <tt>s_def_resuid</> and the +group whose id number is <tt>s_def_resgid</>. The kernel will refuse to +allocate those last <tt>s_r_blocks_count</> blocks if the user is not one of the +above. This is done so that the filesystem will usually not be 100% full, +since 100% full filesystems can affect various aspects of operation. + +<tt>s_def_resuid</> and <tt>s_def_resgid</> contain the id of the user and +of the group who can use the reserved blocks in addition to root. + +<sect1>Filesystem current state +<p> + +<tt>s_free_blocks_count</> contains the current number of free blocks +in the filesystem. + +<tt>s_free_inodes_count</> contains the current number of free inodes in the +filesystem. + +<tt>s_mtime</> contains the time at which the filesystem was last mounted. + +<tt>s_wtime</> contains the last time at which something was changed in the +filesystem. + +<sect>Copyright +<p> + +This document contains source code which was taken from the Linux ext2 +kernel source code, mainly from /usr/include/linux/ext2_fs.h.
Follows +the original copyright: + +<tscreen><verb> +/* + * linux/include/linux/ext2_fs.h + * + * Copyright (C) 1992, 1993, 1994, 1995 + * Remy Card (card@masi.ibp.fr) + * Laboratoire MASI - Institut Blaise Pascal + * Universite Pierre et Marie Curie (Paris VI) + * + * from + * + * linux/include/linux/minix_fs.h + * + * Copyright (C) 1991, 1992 Linus Torvalds + */ + +</verb></tscreen> + +<sect>Acknowledgments +<p> + +I would like to thank the following people, who were involved in the +design and implementation of the ext2 filesystem kernel code and support +utilities: +<itemize> +<item> <tt>Remy Card</> + + Who designed, implemented and maintains the ext2 filesystem kernel + code, and some of the ext2 utilities. <tt>Remy Card</> is also the + author of several helpful slides concerning the ext2 filesystem. + Specifically, he is the author of <tt>File Management in the Linux + Kernel</> and of <tt>The Second Extended File System - Current + State, Future Development</>. + +<item> <tt>Wayne Davison</> + + Who designed the ext2 filesystem. +<item> <tt>Stephen Tweedie</> + + Who helped designing the ext2 filesystem kernel code and wrote the + slides <tt>Optimizations in File Systems</>. +<item> <tt>Theodore Ts'o</> + + Who is the author of several ext2 utilities and of the ext2 library + <tt>libext2fs</> (which I didn't use, simply because I didn't know + it exists when I started to work on my project). +</itemize> + +Lastly, I would like to thank, of-course, <tt>Linus Torvalds</> and the +<tt>Linux community</> for providing all of us with such a great operating +system. + +Please contact me in a case of an error report, suggestions, or just about +anything concerning this document. + +Enjoy, + +Gadi Oxman <tgud@tochnapc2.technion.ac.il> + +Haifa, August 95 +</article>
\ No newline at end of file diff --git a/ext2ed/doc/ext2ed-design-0.1.sgml b/ext2ed/doc/ext2ed-design-0.1.sgml new file mode 100644 index 00000000..ba1bd7aa --- /dev/null +++ b/ext2ed/doc/ext2ed-design-0.1.sgml @@ -0,0 +1,2102 @@ +<!doctype linuxdoc system> + +<!-- EXT2ED - Project notes --> +<!-- First written: July 25 1995 --> +<!-- Last updated: August 3 1995 --> +<!-- This document is written Using the Linux documentation project Linuxdoc-SGML DTD --> + +<article> + +<title>EXT2ED - The Extended-2 filesystem editor - Design and implementation +<author>Programmed by Gadi Oxman, with the guide of Avner Lottem +<date>v0.1, August 3 1995 +<toc> + +<!-- Begin of document --> + +<sect>About EXT2ED documentation +<p> + +The EXT2ED documentation consists of three parts: +<itemize> +<item> The ext2 filesystem overview. +<item> The EXT2ED user's guide. +<item> The EXT2ED design and implementation. +</itemize> + +This document is not the user's guide. If you just intend to use EXT2ED, you +may not want to read it. + +However, if you intend to browse and modify the source code, this document is +for you. + +In any case, if you intend to read this article, I strongly suggest that you +become familiar with the material presented in the other two articles as well. + +<sect>Preface +<p> + +In this document I will try to explain how EXT2ED is constructed. +At the time of writing, the initial version is finished and ready +for distribution; it is fully functional. However, this was not always the +case. + +At first, I didn't know much about Unix, much less about Unix filesystems, +and even less about Linux and the extended-2 filesystem. While working +on this project, I gradually acquired knowledge about all of the above +subjects. I can think of two ways in which I could have carried out my project: +<enum> +<item> The "Engineer" way + + Learn the subject thoroughly before getting to the programming itself.
+ Then, I could easily see the entire picture and select the best + course of action, taking all the factors into account. +<item> The "Explorer - Progressive" way. + + Jump immediately into the cold water - Start programming and + learning the material in parallel. +</enum> + +I guess that the above dilemma is typical and appears all through science and +technology. + +However, I didn't have the luxury of choice when I started my project - +Linux is a relatively new (and great !) operating system. The extended-2 +filesystem is even newer - Its first release lies somewhere in 1993 - Only +two years had passed when I started working on my project. + +The situation I found myself in at the beginning was that I didn't have a fully +detailed document which describes the ext2 filesystem. In fact, I didn't +have any ext2 document at all. When I asked Avner about documentation, he +suggested two references: +<itemize> +<item> A general Unix book - THE DESIGN OF THE UNIX OPERATING SYSTEM, by + Maurice J. Bach. +<item> The kernel sources. +</itemize> +I read the relevant parts of the book before I started my project - It is a +bit old now, but the principles are still the same. However, I needed +more than just the principles. + +The kernel sources are a rare bonus ! You don't get the full sources of an +operating system every day. There is so much that can be learned from +them, and they are the ultimate source - The exact answer to how the kernel +works is there, with all the fine details. In the first week I started to +look at random at the relevant parts of the sources. However, it is difficult +to understand the global picture from direct reading of over one hundred +pages of sources. Then, I started to do some programming. I didn't know +yet what I was looking for, and I started to work on the project like a kid +who starts to build a large puzzle. + +However, this was exactly the interesting part !
It is frustrating to know +it all in advance - I think that the discovery itself, bit by bit, is the +key to true learning and understanding. + +Now, in this document, I am trying to present the subject. Even though I +developed EXT2ED progressively, I can now see the entire subject much +more clearly than I did before, and so I do have the option of presenting it +only in the "engineer" way. However, I will not do that. + +My presentation will be mixed - Sometimes I will present a subject with an +incremental perspective, and sometimes from a "top down" view. I'll leave +you to decide if my presentation choice was wise :-) + +In addition, you'll notice that the sections tend to get shorter as we get +closer to the end. The reason is simply that I started to feel that I was +repeating myself so I decided to present only the new ideas. + +<sect>Getting started ... +<p> + +Getting started is almost always the most difficult task. Once you get +started, things start "running" ... + +<sect1>Before the actual programming +<p> + +From my talks with Avner, I understood that Linux, like any other Unix +system, provides access to the entire disk as though it were a general +file - Accessing the device. It is surely a nice idea. Avner suggested two +ways of action: +<itemize> +<item> Opening the device like a regular file in the user space. +<item> Constructing a device driver which will run in the kernel space and + provide hooks for the user space program. The advantage is that it + will be a part of the kernel, and would be able to use the ext2 + kernel functions to do some of the work. +</itemize> +I chose the first way. I think that the basic reason was simplicity - Learning +the ext2 filesystem was complicated enough, and adding to it the task of +learning how to program in the kernel space was too much.
I still don't know +how to program a device driver, and this is perhaps the bad part, but +looking back at the project, I think that the first way is +superior to the second; Ironically, because of the very reason I chose it - +Simplicity. EXT2ED can now run entirely in the user space (which I think is +a point in favor, because it doesn't require the user to recompile the +kernel), and the entire hard work is mine, which fitted nicely into the +learning experience - I didn't use other code to do the job (aside from +looking at the sources, of-course). + +<sect1>Jumping into the cold water +<p> + +I knew almost nothing about the structure of the ext2 filesystem. +Reading the sources was not enough - I needed to experiment. However, a tool +for experiments in the ext2 filesystem was exactly my project ! - Kind of a +paradox. + +I started immediately with constructing a simple <tt>hex editor</> - It would +open the device as a regular file, provide means of moving inside the +filesystem with a simple <tt>offset</> method, and just show a +<tt>hex dump</> of the contents at this point. Programming this was trivially +simple of-course. At this point, the user-interface didn't matter to me - I +wanted a fast way to interact. As a result, I chose a simple command line +parser. Of course, there were no windows at this point. + +A hex editor is nice, but is not enough. It indeed enabled me to see each part +of the filesystem, but the format of the viewed data was difficult to +analyze. I wanted to see the data in a more intuitive way. + +At this point in time, the most helpful file in the sources was the ext2 +main include file - <tt>/usr/include/linux/ext2_fs.h</>. Among its contents +there were various structures which I assumed to be disk images - They appear +exactly like that on the disk. + +I wanted a <tt>quick</> way to get going. I didn't have the patience to learn +how each of the structures is used in the code.
Rather, I wanted to see them in action, +so that I could explore the connections between them - Test my assumptions, +and reach other assumptions. + +So after the <tt>hex editor</>, EXT2ED progressed into a tool which has some +elements of a compiler. I programmed EXT2ED to <tt>dynamically read the kernel +ext2 main include file at run time</>, and process the information. The goal +was to <tt>apply a structure definition to the current offset in the +filesystem</>. EXT2ED would then display the structure as a list of its +variable names and contents, instead of a meaningless hex dump. + +The format of the include file is not very complicated - The structures +are mostly <tt>flat</> - They didn't contain a lot of recursive structure; Only a +global structure definition, and some variables. There were cases of +structures inside structures; I treated them in a somewhat non-elegant way - I +made all the structures flat, and expanded the arrays. As a result, the parser +was very simple. After all, this was not an exercise in compiling, and I +wanted to quickly get some results. + +To handle the task, I constructed the <tt>struct_descriptor</> structure. +Each <tt>struct_descriptor instance</> contained information which is needed +in order to format a block of data according to the C structure contained in +the kernel source. The information contained: +<itemize> +<item> The descriptor name, used to refer to the structure in EXT2ED. +<item> The name of each variable. +<item> The relative offset of each variable in the data block. +<item> The length, in bytes, of each variable. +</itemize> +Since I didn't want to limit the number of structures, I chose a simple +doubly linked list to store the information. One variable contained the +<tt>current structure type</> - A pointer to the relevant +<tt>struct_descriptor</>. + +Now EXT2ED contained basically four command line operations: +<itemize> +<item> setdevice + + Used to open a device for reading only.
Write access was postponed + to a very advanced stage in the project, simply because I didn't + know a thing about the filesystem structure, and I believed that + making actual changes would do nothing but damage :-) +<item> setoffset + + Used to move in the device. +<item> settype + + Used to apply a structure definition to the current position. +<item> show + + Used to display the data. It displayed the data in a simple hex dump + if there was no type set, or in a nice formatted way - As a list of + the variable contents, if there was. +</itemize> + +Command line analysis was primitive back then - A simple switch, as far as +I can remember - Nothing like the current flow control, but it was enough +at the time. + +In the end, I had something to start working with. It knew how to format many +structures - None of which I understood - and provided me, without too much +work, with something to start with. + +<sect>Starting to explore +<p> + +With the above tool in my pocket, I started to explore the ext2 filesystem +structure. From the brief reading in Bach's book, I became familiar with some +basic concepts - The <tt>superblock</>, for example. It seems that the +superblock is an important part of the filesystem. I decided to start +exploring with that. + +I realized that the superblock should be at a fixed location in the +filesystem - Probably near the beginning. There can be no other way - +The kernel has to start somewhere in order to find it. A brief look at +the kernel sources revealed that the superblock is marked by a special +signature - A <tt>magic number</> - EXT2_SUPER_MAGIC (0xEF53 - EF probably +stands for Extended Filesystem). I quickly found the superblock at the +fixed offset 1024 in the filesystem - The <tt>s_magic</> variable in the +superblock was set exactly to the above value. + +It seems that starting with the <tt>superblock</> was a good bet - Just from +the list of variables, one can learn a lot.
I didn't understand all of them +at the time, but it seemed that the following keywords were repeating themselves +in various variables: +<itemize> +<item> block +<item> inode +<item> group +</itemize> +At this point, I started to explore the block groups. I will not detail here +the technical design of the ext2 filesystem. I have written a special +article which explains just that, in the "engineering" way. Please refer to it +if you feel that you are lacking knowledge of the structure of the ext2 +filesystem. + +I was exploring the filesystem in this way for some time, along with reading +the sources. This led naturally to the next step. + +<sect>Object specific commands +<p> + +What became clear is that the above way of exploring was not powerful +enough - I found myself doing various calculations manually in order to pass +between related structures. I needed to replace some tasks with an automated +procedure. + +In addition, it also became clear that (of-course) each key object in the +filesystem has its special place in regard to the overall ext2 filesystem +design, and needs <tt>fine tuned handling</>. It is at this point that the +structure definitions <tt>came to life</> - They became <tt>object +definitions</>, making EXT2ED <tt>object oriented</>. + +The actual meaning of the breathtaking words above is that each structure +now had a list of <tt>private commands</>, which ended up +<tt>calling special fine-tuned C functions</>. This approach was +found to be very powerful and is <tt>the heart of EXT2ED even now</>. + +In order to implement the above concepts, I added the structure +<tt>struct_commands</>. The role of this structure is to group together a +set of commands, which can later be assigned to a specific type. Each +structure had: +<itemize> +<item> A list of command names. +<item> A list of pointers to functions, which binds each command to its + special fine-tuned C function.
+</itemize> +In order to relate a list of commands to a type definition, each +<tt>struct_descriptor</> structure (explained earlier) was given a private +<tt>struct_commands</> structure. + +The current definitions of <tt>struct_commands</> and of +<tt>struct_descriptor</> follow (<tt>struct_commands</> is listed first because +<tt>struct_descriptor</> embeds it by value): +<tscreen><code> +typedef void (*PF) (char *); + +struct struct_commands { + int last_command; + char *names [MAX_COMMANDS_NUM]; + char *descriptions [MAX_COMMANDS_NUM]; + PF callback [MAX_COMMANDS_NUM]; +}; + +struct struct_descriptor { + unsigned long length; + unsigned char name [60]; + unsigned short fields_num; + unsigned char field_names [MAX_FIELDS][80]; + unsigned short field_lengths [MAX_FIELDS]; + unsigned short field_positions [MAX_FIELDS]; + struct struct_commands type_commands; + struct struct_descriptor *prev,*next; +}; +</code></tscreen> + +<sect><label id="flow_control">Program flow control +<p> + +Obviously the above approach led to a major redesign of EXT2ED. The +main engine of the resulting design is basically the same even now. + +I redesigned the program flow control. Up to then, I had analyzed the user command +line with the simple switch method. Now I used the far superior callback +method. + +I divided the available user commands into two groups: +<enum> +<item> General commands. +<item> Type specific commands. +</enum> +As a result, at each point in time, the user was able to enter a +<tt>general command</>, selectable from a list of general commands which was +always available, or a <tt>type specific command</>, selectable from a list of +commands which <tt>changed in time</> according to the current type that the +user was editing. The special <tt>type specific command</> "knew" how to +handle the object in the best possible way - It was "fine tuned" for the +object's place in the ext2 filesystem design. + +In order to implement the above idea, I constructed a global variable of +type <tt>struct_commands</>, which contained the <tt>general commands</>.
+The <tt>type specific commands</> were accessible through the <tt>struct +descriptors</>, as explained earlier. + +The program flow was now done according to the following algorithm: +<enum> +<item> Ask the user for a command line. +<item> Analyze the user command - Separate it into <tt>command</> and + <tt>arguments</>. +<item> Trace the list of known objects to match the command name to a type. + If the type is found, call the callback function, with the arguments + as a parameter. Then go back to step (1). +<item> If the command is not type specific, try to find it in the general + commands, and call it if found. Go back to step (1). +<item> If the command is not found, issue a short error message, and return + to step (1). +</enum> +Note the <tt>order</> of the above steps. In particular, note that a command +is first assumed to be a type-specific command and only if this fails, a +general command is searched. The "<tt>side-effect</>" (main effect, actually) +is that when we have two commands with the <tt>same name</> - One that is a +type specific command, and one that is a general command, the dispatching +algorithm will call the <tt>type specific command</>. This allows +<tt>overriding</> of a command to provide <tt>fine-tuned</> operation. +For example, the <tt>show</> command is overridden nearly everywhere, +to accommodate for the different ways in which different objects are displayed, +in order to provide an intuitive fine-tuned display. + +The above is done in the <tt>dispatch</> function, in <tt>main.c</>. Since +it is a very important function in EXT2ED, and it is relatively short, I will +list it entirely here. Note that a redesign was made since then - Another +level was added between the two described, but I'll elaborate more on this +later. However, the basic structure follows the explanation described above. 
+<tscreen><code> +int dispatch (char *command_line) + +{ + int i,found=0; + char command [80]; + + parse_word (command_line,command); + + if (strcmp (command,"quit")==0) return (1); + + /* 1. Search for type specific commands FIRST - Allows overriding of a general command */ + + if (current_type != NULL) + for (i=0;i<=current_type->type_commands.last_command && !found;i++) { + if (strcmp (command,current_type->type_commands.names [i])==0) { + (*current_type->type_commands.callback [i]) (command_line); + found=1; + } + } + + /* 2. Now search for ext2 filesystem general commands */ + + if (!found) + for (i=0;i<=ext2_commands.last_command && !found;i++) { + if (strcmp (command,ext2_commands.names [i])==0) { + (*ext2_commands.callback [i]) (command_line); + found=1; + } + } + + + /* 3. If not found, search the general commands */ + + if (!found) + for (i=0;i<=general_commands.last_command && !found;i++) { + if (strcmp (command,general_commands.names [i])==0) { + (*general_commands.callback [i]) (command_line); + found=1; + } + } + + if (!found) { + wprintw (command_win,"Error: Unknown command\n"); + refresh_command_win (); + } + + return (0); +} +</code></tscreen> + +<sect>Source files in EXT2ED +<p> + +The project was getting large enough to be split into several source +files. I split the source as much as I could into self-contained +source files. The source files consist of the following blocks: +<itemize> +<item> <tt>Main include file - ext2ed.h</> + + This file contains the definitions of the various structures, + variables and functions used in EXT2ED. It is included by all source + files in EXT2ED. + +<item> <tt>Main block - main.c</> + + <tt>main.c</> handles the upper level of the program flow control. + It contains the <tt>parser</> and the <tt>dispatcher</>. Its task is + to ask the user for a required action, and to pass control to other + lower level functions in order to do the actual job.
+ +<item> <tt>Initialization - init.c</> + + The init source is responsible for the various initialization + actions which need to be done throughout the program. For example, + auto detection of an ext2 filesystem when selecting a device and + initialization of the filesystem-specific structures described + earlier. + +<item> <tt>Disk activity - disk.c</> + + <tt>disk.c</> handles the lower level interaction with the + device. All disk activity is passed through this file - The various + functions throughout the source code request disk actions from the + functions in this file. In this way, for example, we can easily block + the write access to the device. + +<item> <tt>Display output activity - win.c</> + + In a similar way to <tt>disk.c</>, the user-interface functions and + most of the interaction with the <tt>ncurses library</> are done + here. Nothing will be actually written to a specific window without + calling a function from this file. + +<item> <tt>Commands available through dispatching - *_com.c </> + + The above file name is generic - Each file which ends with + <tt>_com.c</> contains a group of related commands which can be + called through <tt>the dispatching function</>. + + Each object typically has its own file. A separate file is also + available for the general commands. +</itemize> +The entire list of source files available at this time is: +<itemize> +<item> blockbitmap_com.c +<item> dir_com.c +<item> disk.c +<item> ext2_com.c +<item> file_com.c +<item> general_com.c +<item> group_com.c +<item> init.c +<item> inode_com.c +<item> inodebitmap_com.c +<item> main.c +<item> super_com.c +<item> win.c +</itemize> + +<sect>User interface +<p> + +The user interface is text-based only and is based on the following +libraries: + +<itemize> +<item> The <tt>ncurses</> library, developed by <tt>Zeyd Ben-Halim</>. +<item> The <tt>GNU readline</> library.
+</itemize> + +The user interaction is command line based - The user enters a command +line, which consists of a <tt>command</> and of <tt>arguments</>. This fits +nicely with the program flow control described earlier - The <tt>command</> +is used by <tt>dispatch</> to select the right function, and the +<tt>arguments</> are interpreted by the function itself. + +<sect1>The ncurses library +<p> + +The <tt>ncurses</> library enables me to divide the screen into "windows". +The main advantage is that I treat the "window" in a virtual way, asking +the ncurses library to "write to a window". However, the ncurses +library internally buffers the requests, and nothing is actually passed to the +terminal until an explicit refresh is requested. When the refresh request is +made, ncurses compares the current terminal state (as known in the last time +that a refresh was done) with the new to be shown state, and passes to the +terminal the minimal information required to update the display. As a +result, the display output is optimized behind the scenes by the +<tt>ncurses</> library, while I can still treat it in a virtual way. + +There are two basic concepts in the <tt>ncurses</> library: +<itemize> +<item> A window. +<item> A pad. +</itemize> +A window can be no bigger than the actual terminal size. A pad, however, is +not limited in its size. + +The user screen is divided by EXT2ED into three windows and one pad: +<itemize> +<item> Title window. +<item> Status window. +<item> Main display pad. +<item> Command window. +</itemize> + +The <tt>title window</> is static - It just displays the current version +of EXT2ED. + +The user interaction is done in the <tt>command window</>. The user enters a +<tt>command line</>, feedback is usually displayed there, and then relevant +data is usually displayed in the main display and in the status window. 
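The difference between a window and a pad can be modelled in a few lines of plain C. The sketch below is only an illustration of the bookkeeping a scrollable display needs; it does not call ncurses, and the structure `pad_view` and the functions `pad_max_top`, `pad_pgdn` and `pad_pgup` are hypothetical names invented here, not taken from EXT2ED's `win.c`.

```c
#include <assert.h>

/* Hypothetical model of a pad: a buffer that may be taller than the
 * terminal, of which only view_rows rows starting at row top are shown. */
struct pad_view {
    int total_rows;  /* rows written to the pad so far            */
    int view_rows;   /* rows that fit in the visible display area */
    int top;         /* first pad row currently shown             */
};

/* Largest legal value of top: the last page stays fully visible. */
int pad_max_top(const struct pad_view *p)
{
    int max_top = p->total_rows - p->view_rows;
    return max_top > 0 ? max_top : 0;
}

/* Scroll one page down, clamping at the end of the pad. */
void pad_pgdn(struct pad_view *p)
{
    assert(p->view_rows > 0);
    p->top += p->view_rows;
    if (p->top > pad_max_top(p))
        p->top = pad_max_top(p);
}

/* Scroll one page up, clamping at the start of the pad. */
void pad_pgup(struct pad_view *p)
{
    assert(p->view_rows > 0);
    p->top -= p->view_rows;
    if (p->top < 0)
        p->top = 0;
}
```

With ncurses itself, a value like `top` would presumably be the pad row passed to `prefresh` as the upper-left corner of the region to copy to the screen; how EXT2ED actually tracks its scroll position is not shown in this document.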
+ +The <tt>main display</> uses a <tt>pad</> instead of a window because +the amount of information which is written to it is not known in advance. +Therefore, the user treats the main display as a "window" into a bigger +display and can <tt>scroll vertically</> using the <tt>pgdn</> and <tt>pgup</> +commands. Although the <tt>pad</> mechanism enables me to use horizontal +scrolling, I have not utilized this. + +When I need to show something to the user, I use the ncurses <tt>wprintw</> +command. Then an explicit refresh command is required. As explained before, +the refresh commands are routed through <tt>win.c</>. For example, to update +the command window, <tt>refresh_command_win ()</> is used. + +<sect1>The readline library +<p> + +Avner suggested that I integrate the GNU <tt>readline</> library in my project. +The <tt>readline</> library is designed specifically for programs which use +a command line interface. It provides a nice package of <tt>command line editing +tools</> - Inserting, deleting words, and the whole package of editing tools +which are normally available in the <tt>bash</> shell (Refer to the readline +documentation for details). In addition, I utilized the <tt>history</> +feature of the readline library - The entered commands are saved in a +<tt>command history</>, and can be recalled later by whatever means the +readline package provides. Command completion is also supported - When the +user enters a partial command name, EXT2ED will provide the readline library +with the possible completions. + +<sect>Possible support of other filesystems +<p> + +The entire ext2 layer is provided through specific objects. Given another +set of objects, support of other filesystems can be provided using the same +dispatching mechanism. In order to prepare the ground for this option, I +added yet another layer to the two-layer structure presented earlier. EXT2ED +commands now consist of three layers: +<itemize> +<item> The general commands.
+<item> The ext2 general commands. +<item> The ext2 object specific commands. +</itemize> +The general commands are provided by the <tt>general_com.c</> source file, +and are always available. The two other levels are not present when EXT2ED +loads - They are dynamically added by <tt>init.c</> when EXT2ED detects an +ext2 filesystem on the device. + +The abstraction levels presented above help to extend EXT2ED to fully +support a new filesystem, with its own specific type commands. + +Even without any source code modification, the user is free to add structure +definitions in a separate file (specified in the configuration file), +which will be added to the list of available objects. The added objects will +consist only of variables, of-course, and will be used through the more +primitive <tt>setoffset</> and <tt>settype</> commands. + +<sect>On the implementation of the various commands +<p> + +This section points out some typical programming style that I used in many +places in the code. + +<sect1>The explicit use of the dispatch function +<p> + +The various commands are reached by the user through the <tt>dispatch</> +function. This is not surprising. What may be surprising, at least at +first glance, is that <tt>you'll find the <em>dispatch</> call in many of my +own functions !</>. + +I am in fact using my own implemented functions to construct higher +level operations. I am heavily using the fact that the dispatching mechanism +is object oriented and that the <tt>overriding</> principle takes place and +selects the proper function to call when several commands with the same name +are accessible. + +Sometimes, however, I call the explicit command directly, without passing +through <tt>dispatch</>. This is typically done when I want to bypass the +<tt>overriding</> effect. + +<tscreen><verb> +This is used, for example, in the interaction between the global cd command +and the dir object specific cd command.
You will see there that in order +to implement the "entire" cd command, the type specific cd command uses both +a dispatching mechanism to call itself recursively if a relative path is +used, or a direct call of the general cd handling function if an explicit path +is used. +</verb></tscreen> + +<sect1>Passing information between handling functions +<p> + +Typically, every source code file which handles one object type has a global +structure specifically designed for it which is used by most of the +functions in that file. This is used to pass information between the various +functions there, and to physically provide the link to other related +objects, typically for initialization use. + +<tscreen><verb> +For example, in order to edit a file, information about the +inode is needed - The file command is available only when editing an +inode. When the file command is issued, the handling function (found, +according to the source division outlined above, in inode_com.c) will +store the necessary information about the inode in a specific structure +of type struct_file_info which will be available for use by the file_com.c +functions. Only then it will set the type to file. This is also the reason +that a direct asynchronic set of the object type to a file through a settype +command will fail - The above data structure will not be initialized +properly because the user never was at the inode of the file. +</verb></tscreen> + +<sect1>A very simplified overview of a typical command handling function +<p> + +This is a very simplified overview. Detailed information will follow +where appropriate. + +<sect2>The prototype of a typical handling function +<p> + +<enum> +<item> I chose a unified <tt>naming convention</> for the various object + specific commands. 
It is perhaps best shown with an example: + + The prototype of the handling function of the command <tt>next</> of + the type <tt>file</> is: + <tscreen><verb> + extern void type_file___next (char *command_line); + </verb></tscreen> + + For other types and commands, the words <tt>file</> and <tt>next</> + should be replaced accordingly. + +<item> The ext2 general commands syntax is similar. For example, the ext2 + general command <tt>super</> results in calling: + <tscreen><verb> + extern void type_ext2___super (char *command_line); + </verb></tscreen> + Those functions are available in <tt>ext2_com.c</>. +<item> The general commands syntax is even simpler - The name of the + handling function is exactly the name of the command. Those + functions are available in <tt>general_com.c</>. +</enum> + +<sect2> "Typical" algorithm +<p> + +This section can't of course provide meaningful information - Each +command is handled differently, but the following frame is typical: +<enum> +<item> Parse command line arguments and analyze them. Return with an error + message if the syntax is wrong. +<item> "Act accordingly", perhaps making use of the global variable available + to this type. +<item> Use some <tt>dispatch / direct </> calls in order to pass control to + other lower-level user commands. +<item> Sometimes <tt>dispatch</> to the object's <tt>show</> command to + display the resulting data to the user. +</enum> +I told you it is meaningless :-) + +<sect>Initialization overview +<p> + +In this section I will discuss some aspects of the various initialization +routines available in the source file <tt>init.c</>. 
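The naming convention described in the previous section can be captured in a few lines. The following is only a minimal sketch, not code from EXT2ED: the helper <tt>handler_name</> is hypothetical, and passing <tt>NULL</> as the type is my own way of standing for a general command, whose handler carries the command's own name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of EXT2ED): build the name of the C
 * handling function for a command.
 *   type == NULL  -> general command, the handler is named like the command
 *   otherwise     -> "type_" + type + "___" + command
 * Returns the value of snprintf, i.e. the length the full name needs.
 */
int handler_name(char *out, size_t size, const char *type, const char *command)
{
    if (type == NULL)
        return snprintf(out, size, "%s", command);
    return snprintf(out, size, "type_%s___%s", type, command);
}
```

With type <tt>file</> and command <tt>next</> the sketch produces <tt>type_file___next</>, which matches the prototype quoted in the naming convention.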
+ +<sect1>Upon startup +<p> + +Follows the function <tt>main</>, appearing of-course in <tt>main.c</>: +<tscreen><code> +int main (void) + +{ + if (!init ()) return (0); /* Perform some initial initialization */ + /* Quit if failed */ + + parser (); /* Get and parse user commands */ + + prepare_to_close (); /* Do some cleanup */ + printf ("Quitting ...\n"); + return (1); /* And quit */ +} +</code></tscreen> + +The two initialization functions, which are called by <tt>main</>, are: +<itemize> +<item> init +<item> prepare_to_close +</itemize> + +<sect2>The init function +<p> + +<tt>init</> is called from <tt>main</> upon startup. It initializes the +following tasks / subsystems: +<enum> +<item> Processing of the <tt>user configuration file</>, by using the + <tt>process_configuration_file</> function. Failing to complete the + configuration file processing is considered a <tt>fatal error</>, + and EXT2ED is aborted. I did it this way because the configuration + file has some sensitive user options like write access behavior, and + I wanted to be sure that the user is aware of them. +<item> Registration of the <tt>general commands</> through the use of + the <tt>add_general_commands</> function. +<item> Reset of the object memory rotating lifo structure. +<item> Reset of the device parameters and of the current type. +<item> Initialization of the windows subsystem - The interface between the + ncurses library and EXT2ED, through the use of the <tt>init_windows</> + function, available in <tt>win.c</>. +<item> Initialization of the interface between the readline library and + EXT2ED, through <tt>init_readline</>. +<item> Initialization of the <tt>signals</> subsystem, through + <tt>init_signals</>. +<item> Disabling write access. Write access needs to be explicitly enabled + using a user command, to prevent accidental user mistakes. +</enum> +When <tt>init</> is finished, it dispatches the <tt>help</> command in order +to show the available commands to the user. 
Note that the ext2 layer is still +not added; it will be added if and when EXT2ED detects an ext2 +filesystem on a device. + +<sect2>The prepare_to_close function +<p> + +The <tt>prepare_to_close</> function reverses some of the actions done +earlier in EXT2ED and frees the dynamically allocated memory. +Specifically, it: +<enum> +<item> Closes the open device, if any. +<item> Removes the first level - Removing the general commands, through + the use of <tt>free_user_commands</>, with a pointer to the + general_commands structure as a parameter. +<item> Removes the second level - Removing the ext2 general + commands, in much the same way. +<item> Removes the third level - Removing the objects and the object + specific commands, by using <tt>free_struct_descriptors</>. +<item> Closes the window subsystem, and detaches EXT2ED from the ncurses + library, through the use of the <tt>close_windows</> function, + available in <tt>win.c</>. +</enum> + +<sect1> Registration of commands +<p> + +Addition of a user command is done through the <tt>add_user_command</> +function. The prototype is: +<tscreen><verb> +void add_user_command (struct struct_commands *ptr,char *name,char +*description,PF callback); +</verb></tscreen> +The function receives a pointer to a structure of type +<tt>struct_commands</>, a desired name for the command which will be used by +the user to identify the command, a short description which is utilized by the +<tt>help</> subsystem, and a pointer to a C function which will be called if +<tt>dispatch</> decides that this command was requested. + +<tt>add_user_command</> is a <tt>low level function</> used in the three +levels to add user commands. 
For example, addition of the <tt>ext2 +general commands</> is done by: +<tscreen><code> +void add_ext2_general_commands (void) + +{ + add_user_command (&ero;ext2_commands,"super","Moves to the superblock of the filesystem",type_ext2___super); + add_user_command (&ero;ext2_commands,"group","Moves to the first group descriptor",type_ext2___group); + add_user_command (&ero;ext2_commands,"cd","Moves to the directory specified",type_ext2___cd); +} +</code></tscreen> + +<sect1>Registration of objects +<p> + +Registration of objects is based, as explained earlier, on the "compilation" +of an external user file, which has a syntax similar to the C language +<tt>struct</> keyword. The primitive parser I have implemented detects the +definition of structures, and calls some lower level functions to actually +register the newly detected object. The parser's prototype is: +<tscreen><verb> +int set_struct_descriptors (char *file_name) +</verb></tscreen> +It opens the given file name, and calls, when appropriate: +<itemize> +<item> add_new_descriptor +<item> add_new_variable +</itemize> +<tt>add_new_descriptor</> is a low level function which adds a new descriptor +to the doubly linked list of the available objects. It will then call +<tt>fill_type_commands</>, which will add specific commands to the object, +if the object is known. + +<tt>add_new_variable</> will add a new variable of the requested length to the +specified descriptor. + +<sect1>Initialization upon specification of a device +<p> + +When the general command <tt>setdevice</> is used to open a device, an +initialization sequence takes place, which is intended to determine two +factors: +<itemize> +<item> Are we dealing with an ext2 filesystem ? +<item> What are the basic filesystem parameters, such as its total size and + its block size ? +</itemize> +These questions are answered by the <tt>set_file_system_info</> function, possibly +using some <tt>help from the user</>, through the configuration file. 
+The answers are placed in the <tt>file_system_info</> structure, which is of +type <tt>struct_file_system_info</>: +<tscreen><code> +struct struct_file_system_info { + unsigned long file_system_size; + unsigned long super_block_offset; + unsigned long first_group_desc_offset; + unsigned long groups_count; + unsigned long inodes_per_block; + unsigned long blocks_per_group; /* The name is misleading; beware */ + unsigned long no_blocks_in_group; + unsigned short block_size; + struct ext2_super_block super_block; +}; +</code></tscreen> + +Autodetection of an ext2 filesystem is usually recommended. However, on a damaged +filesystem I can't guarantee success. That's where the user comes in - He can +<tt>override</> the auto detection procedure and force an ext2 filesystem, by +selecting the proper options in the configuration file. + +If auto detection succeeds, the second question above is automatically +answered - I get all the information I need from the filesystem itself. In +any case, default parameters can be supplied in the configuration file and +the user can select the required behavior. + +If we decide to treat the filesystem as an ext2 filesystem, <tt>registration of +the ext2 specific objects</> is done at this point, by calling the +<tt>set_struct_descriptors</> function outlined earlier, with the name of the file +which describes the ext2 objects. That file is based on the ext2 sources +main include file. At this point, EXT2ED can be fully used by the user. + +If we do not register the ext2 specific objects, the user can still provide +object definitions in a separate file, and will be able to use EXT2ED in a +<tt>limited form</>, but more sophisticated than a simple hex editor. + +<sect>main.c +<p> + +As described earlier, <tt>main.c</> is used as a front end to the entire +program. <tt>main.c</> contains the following elements: + +<sect1>The main routine +<p> + +The <tt>main</> routine was displayed above. 
Its task is to pass control to +the initialization routines and to the parser. + +<sect1>The parser +<p> + +The parser consists of the following functions: +<itemize> +<item> The <tt>parser</> function, which reads the command line from the + user and saves it in readline's history buffer and in the internal + last-command buffer. +<item> The <tt>parse_word</> function, which receives a string and parses + the first word from it, ignoring whitespace, and returns a pointer + to the rest of the string. +<item> The <tt>complete_command</> function, which is used by the readline + library for command completion. It scans the available commands at + this point and determines the possible completions. +</itemize> + +<sect1>The dispatcher +<p> + +The dispatcher was already explained in the flow control section - section +<ref id="flow_control">. Its task is to pass control to the proper command +handling function, based on the command named on the command line. + +<sect1>The self-sanity control +<p> + +This is not fully implemented. + +The general idea was to provide a control system which will supervise the +internal work of EXT2ED. Since I am pretty sure that bugs exist, I have +double checked myself in a few instances, and issued an <tt>internal +error</> warning if I reached the conclusion that something is not logical. +The internal error is reported by the function <tt>internal_error</>, +available in <tt>main.c</>. + +The self sanity check is compiled only if the compile time option +<tt>DEBUG</> is selected. + +<sect>The windows interface +<p> + +Screen handling and interfacing to the <tt>ncurses</> library is done in +<tt>win.c</>. + +<sect1>Initialization +<p> + +Opening of the windows is done in <tt>init_windows</>. In +<tt>close_windows</>, we just close our windows. The various window lengths, +with the exception of the <tt>show pad</>, are defined in the main header file. +The rest of the display will be used by the <tt>show pad</>. 
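As an illustration of the layout rule above, the visible height of the show pad is whatever remains once the fixed windows are placed. This is a sketch under assumptions: the three constants and their values are hypothetical stand-ins for the definitions in the main header file, and <tt>show_pad_visible_lines</> is not an EXT2ED function.

```c
#include <assert.h>

/* Hypothetical fixed window heights; the real values live in the
 * main header file and may differ. */
#define SKETCH_TITLE_WIN_LINES   3
#define SKETCH_SHOW_WIN_LINES    3
#define SKETCH_COMMAND_WIN_LINES 6

/* Number of screen lines left for the show pad, never negative. */
int show_pad_visible_lines(int screen_lines)
{
    int fixed = SKETCH_TITLE_WIN_LINES + SKETCH_SHOW_WIN_LINES
              + SKETCH_COMMAND_WIN_LINES;

    return screen_lines > fixed ? screen_lines - fixed : 0;
}
```

Clamping at zero is a choice made for the sketch, so that a very small terminal yields an empty pad rather than a negative height.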
+ +<sect1>Display output +<p> + +Each actual refresh of the terminal display is done by using the +appropriate refresh function from this file: <tt>refresh_title_win</>, +<tt>refresh_show_win</>, <tt>refresh_show_pad</> and +<tt>refresh_command_win</>. + +With the exception of the <tt>show pad</>, each function simply calls the +<tt>ncurses refresh command</>. In order to provide <tt>scrolling</> in +the <tt>show pad</>, some information about its status is constantly updated +by the various functions which display output in it. <tt>refresh_show_pad</> +passes this information to <tt>ncurses</> so that the correct part of the pad +is actually copied to the display. + +The above information is saved in a global variable of type <tt>struct +struct_pad_info</>: + +<tscreen><code> +struct struct_pad_info { + int display_lines,display_cols; + int line,col; + int max_line,max_col; + int disable_output; +}; +</code></tscreen> + +<sect1>Screen redraw +<p> + +The <tt>redraw_all</> function will just reopen the windows. This action is +necessary if the display gets garbled for some reason. + +<sect>The disk interface +<p> + +All the disk activity with regard to the filesystem passes through the file +<tt>disk.c</>. It is done this way to provide additional levels of safety +concerning the disk access. This way, global decisions concerning the disk +can be easily enforced. The benefits of this isolation will become even +clearer in the next sections. + +<sect1>Low level functions +<p> + +Read requests are ultimately handled by <tt>low_read</> and write requests +are handled by <tt>low_write</>. They just receive the length of the data +block, the offset in the filesystem and a pointer to the buffer and pass the +request to the <tt>fread</> or <tt>fwrite</> standard library functions. + +<sect1>Mounted filesystems +<p> + +The EXT2ED design assumes that the edited filesystem is not mounted. 
Even if +a <tt>reasonably simple</> way to handle mounted filesystems exists, it is +probably <tt>too complicated</> :-) + +Write access to a mounted filesystem will be denied. Read access can be +allowed by using a configuration file option. The mount status is determined +by reading the file /etc/mtab. + +<sect1>Write access +<p> + +Write access is the most sensitive part of the program. This program is +intended for <tt>editing filesystems</>. It is obvious that a small mistake +in this regard can make the filesystem unusable. + +The following safety measures come, of course, in addition to the general Unix +permission protection - The user can always disable write access on the +device file itself. + +Concerning the user, the following safety measures were taken: +<enum> +<item> The filesystem is <tt>never</> opened with write-access enabled. + Rather, the user must explicitly request to enable write-access. +<item> The user can <tt>disable</> write access entirely by using a + <tt>configuration file option</>. +<item> Changes are never done automatically - Whenever the user makes + changes, they are done in memory. An explicit <tt>writedata</> + command should be issued to make the changes active on the disk. +</enum> +As for myself, I tried to protect against my bugs by: +<itemize> +<item> Opening the device in read-only mode until a write request is + issued by the user. +<item> Limiting <tt>actual</> filesystem access to two functions only - + <tt>low_read</> for reading, and <tt>low_write</> for writing. Those + functions were programmed carefully, and I added the self + sanity checks there. In addition, this is the only place in which I + need to check the user options described above - There can be no + place in which I can "forget" to check them. 
+ + Note that the disabling of write-access through the configuration file + is double checked here only as a <tt>self-sanity</> check, and only if + <tt>DEBUG</> is selected. The request to enable writing should already have been refused, + and write-access is always disabled at startup, so finding + <tt>here</> that the user has write access disabled through the + configuration file clearly indicates that I have a bug somewhere. +</itemize> + +The following safety measure can provide protection against <tt>both</> user +mistakes and my own bugs: +<itemize> +<item> I added a <tt>logging option</>, which logs every actual write + access to the disk at the lowest level - In <tt>low_write</> itself. + + The logging has nothing to do with the current type and the various + other higher level operations of EXT2ED - It is simply a hex dump of + the contents which will be overwritten; Both the original contents + and the newly written data. + + In that case, even if the user makes a mistake, the original data + can be retrieved. + + Even if I have a bug somewhere which causes incorrect data to be + written to the disk, the logging option will still log exactly the + original contents at the place where data was incorrectly overwritten. + (This assumes, of course, that <tt>low_write</> and the <tt>logging + itself</> work correctly. I have done my best to verify that this is + indeed the case). + + The <tt>logging</> option is implemented in the <tt>log_changes</> + function. 
+</itemize> + +<sect1>Reading / Writing objects +<p> + +Usually <tt>(not always)</>, the current object data is available in the +global variable <tt>type_data</>, which is of the type: +<tscreen><code> +struct struct_type_data { + long offset_in_block; + + union union_type_data { + char buffer [EXT2_MAX_BLOCK_SIZE]; + struct ext2_acl_header t_ext2_acl_header; + struct ext2_acl_entry t_ext2_acl_entry; + struct ext2_old_group_desc t_ext2_old_group_desc; + struct ext2_group_desc t_ext2_group_desc; + struct ext2_inode t_ext2_inode; + struct ext2_super_block t_ext2_super_block; + struct ext2_dir_entry t_ext2_dir_entry; + } u; +}; +</code></tscreen> +The above union enables me, in the program, to treat the data as raw data or +as a meaningful filesystem object. + +The reading and writing, if done to this global variable, are done through +the functions <tt>load_type_data</> and <tt>write_type_data</>, available in +<tt>disk.c</>. + +<sect>The general commands +<p> + +The <tt>general commands</> are handled in the file <tt>general_com.c</>. + +<sect1>The help system +<p> + +The help command is handled by the function <tt>help</>. The algorithm is as +follows: + +<enum> +<item> Check the command line arguments. If there is an argument, pass + control to the <tt>detailed_help</> function, in order to provide + help on the specific command. +<item> If general help was requested, display a list of the available + commands at this point. The three levels are displayed in reverse + order - First the commands which are specific to the current type + (If a current type is defined), then the ext2 general commands (If + we decided that the filesystem should be treated like an ext2 + filesystem), then the general commands. +<item> Display information about EXT2ED - Current version, general + information about the project, etc. +</enum> + +<sect1>The setdevice command +<p> + +The <tt>setdevice</> commands result in calling the <tt>set_device</> +function. 
The algorithm is: + +<enum> +<item> Parse the command line argument. If it isn't available, report the + error and return. +<item> Close the currently open device, if there is one. +<item> Open the new device in read-only mode. Update the global variables + <tt>device_name</> and <tt>device_handle</>. +<item> Disable write access. +<item> Empty the object memory. +<item> Unregister the ext2 general commands, using + <tt>free_user_commands</>. +<item> Unregister the current objects, using <tt>free_struct_descriptors</>. +<item> Call <tt>set_file_system_info</> to auto-detect an ext2 filesystem + and set the basic filesystem values. +<item> Add the <tt>alternate descriptors</>, supplied by the user. +<item> Set the device offset to the filesystem start by dispatching + <tt>setoffset 0</>. +<item> Show the new available commands by dispatching the <tt>help</> + command. +</enum> + +<sect1>Basic maneuvering +<p> + +Basic maneuvering is done using the <tt>setoffset</> and the <tt>settype</> +user commands. + +<tt>set_offset</> accepts some alternative forms of specifying the new +offset. They all ultimately lead to changing the <tt>device_offset</> +global variable and seeking to the new position. <tt>set_offset</> also +calls <tt>load_type_data</> to read a block ahead of the new position into +the <tt>type_data</> global variable. + +<tt>set_type</> will point the global variable <tt>current_type</> to the +correct entry in the doubly linked list of the known objects. If the +requested type is <tt>hex</> or <tt>none</>, <tt>current_type</> will be +set to <tt>NULL</>. <tt>set_type</> will also dispatch <tt>show</>, +so that the object data will be re-formatted in the new format. + +When editing an ext2 filesystem, it is not intended that those commands will +be used directly, and it is usually not required. My implementation of the +ext2 layer, on the other hand, uses these lower level commands on countless +occasions. 
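The type selection described above can be sketched as a walk over the list of known objects. This is an illustration only, not the EXT2ED source: <tt>struct descriptor_sketch</> is a simplified, hypothetical stand-in for the real descriptor structure, <tt>select_type</> is a made-up name, and the sketch leaves out the final dispatch of <tt>show</>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified object descriptor: only the fields the
 * lookup needs. The real structure holds much more. */
struct descriptor_sketch {
    const char *name;
    struct descriptor_sketch *prev, *next;
};

/*
 * Point *current at the descriptor called `name`.
 * "hex" and "none" select no type at all, i.e. *current becomes NULL.
 * Returns 1 on success, 0 if the type is unknown (*current is untouched).
 */
int select_type(struct descriptor_sketch *first, const char *name,
                struct descriptor_sketch **current)
{
    struct descriptor_sketch *d;

    if (strcmp(name, "hex") == 0 || strcmp(name, "none") == 0) {
        *current = NULL;
        return 1;
    }
    for (d = first; d != NULL; d = d->next) {
        if (strcmp(d->name, name) == 0) {
            *current = d;
            return 1;
        }
    }
    return 0;
}
```

Leaving the current type unchanged on an unknown name is a choice made for the sketch; the text does not say what EXT2ED does in that case.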
+ +<sect1>The display functions +<p> + +The general command version of <tt>show</> is handled by the <tt>show</> +function. This command is overridden by various objects to provide a display +which is better suited to the object. + +The general show command will format the data in <tt>type_data</> according +to the structure definition of the current type and show it on the <tt>show +pad</>. If there is no current type, the data will be shown as a simple hex +dump; Otherwise, the list of variables, along with their values, will be shown. + +A call to <tt>show_info</> is also made - <tt>show_info</> will provide +<tt>general statistics</> on the <tt>show_window</>, such as the current +block, current type, current offset and current page. + +The <tt>pgup</> and <tt>pgdn</> general commands just update the +<tt>show_pad_info</> global variable - We just adjust +<tt>show_pad_info.line</> by the number of lines in the screen - +<tt>show_pad_info.display_lines</>, which was initialized in +<tt>init_windows</>. + +<sect1>Changing data +<p> + +Data change is done in memory only. The disk is updated only when an +explicit <tt>writedata</> command follows. The <tt>write_data</> +function simply calls the <tt>write_type_data</> function, outlined earlier. + +The <tt>set</> command is used for changing the data. + +If there is no current type, control is passed to the <tt>hex_set</> function, +which treats the data as a block of bytes and uses the +<tt>type_data.offset_in_block</> variable to write the new text or hex string +to the correct place in the block. + +If a current type is defined, the requested variable is searched for in the +current object, and the desired new value is entered. + +The <tt>enablewrite</> command just sets the global variable +<tt>write_access</> to <tt>1</> and re-opens the filesystem in read-write +mode, if possible. 
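The hex-mode branch of <tt>set</> boils down to copying bytes into the in-memory block at <tt>offset_in_block</>. The following is a minimal sketch of that idea, not the actual <tt>hex_set</>: the structure is a cut-down, hypothetical version of <tt>type_data</>, the block size is an assumed value, the function name is made up, and the parsing of the text or hex string is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 1024   /* assumed block size, for illustration */

/* Hypothetical cut-down version of the type_data global. */
struct type_data_sketch {
    long offset_in_block;
    unsigned char buffer[SKETCH_BLOCK_SIZE];
};

/*
 * Copy `len` already-decoded bytes into the block at offset_in_block.
 * Only the memory copy changes; nothing is written to the disk.
 * Returns the number of bytes stored, or 0 if they would not fit.
 */
size_t hex_set_bytes(struct type_data_sketch *td,
                     const unsigned char *src, size_t len)
{
    size_t off;

    if (td->offset_in_block < 0)
        return 0;
    off = (size_t) td->offset_in_block;
    if (off >= sizeof td->buffer || len > sizeof td->buffer - off)
        return 0;
    memcpy(td->buffer + off, src, len);
    return len;
}
```

Refusing a write that would run past the end of the block is a design choice of the sketch, made so that an in-memory edit can never touch data outside the current block.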
+ +If the current type is NULL, a hex-mode is assumed - The <tt>next</> and +<tt>prev</> commands will just update <tt>type_data.offset_in_block</>. + +If the current type is not NULL, the <tt>next</> and <tt>prev</> commands +are usually overridden anyway. If they are not overridden, it will be assumed +that the user is editing an array of such objects, and they will just pass +to the next / prev element by dispatching to <tt>setoffset</> using the +<tt>setoffset type + / - X</> syntax. + +<sect>The ext2 general commands +<p> + +The ext2 general commands are contained in the <tt>ext2_general_commands</> +global variable (which is of type <tt>struct struct_commands</>). + +The handling functions are implemented in the source file <tt>ext2_com.c</>. +I will include the entire source code since it is relatively short. + +<sect1>The super command +<p> + +The super command just "brings the user" to the main superblock and sets the +type to ext2_super_block. The implementation is trivial: + +<tscreen><code> +void type_ext2___super (char *command_line) + +{ + char buffer [80]; + + super_info.copy_num=0; + sprintf (buffer,"setoffset %ld",file_system_info.super_block_offset);dispatch (buffer); + sprintf (buffer,"settype ext2_super_block");dispatch (buffer); +} +</code></tscreen> +It involves only setting the <tt>copy_num</> variable to indicate the main +copy, dispatching a <tt>setoffset</> command to reach the superblock, and +dispatching a <tt>settype</> to enable the superblock specific commands. +This last command will also call the <tt>show</> command of the +<tt>ext2_super_block</> type, because the general command +<tt>settype</> dispatches it. + +<sect1>The group command +<p> + +The group command will bring the user to the specified group descriptor in +the main copy of the group descriptors. 
The type will be set to +<tt>ext2_group_desc</>: +<tscreen><code> +void type_ext2___group (char *command_line) + +{ + long group_num=0; + char *ptr,buffer [80]; + + ptr=parse_word (command_line,buffer); + if (*ptr!=0) { + ptr=parse_word (ptr,buffer); + group_num=atol (buffer); + } + + group_info.copy_num=0;group_info.group_num=0; + sprintf (buffer,"setoffset %ld",file_system_info.first_group_desc_offset);dispatch (buffer); + sprintf (buffer,"settype ext2_group_desc");dispatch (buffer); + sprintf (buffer,"entry %ld",group_num);dispatch (buffer); +} +</code></tscreen> +The implementation is as trivial as the <tt>super</> implementation. Note +the use of the <tt>entry</> command, which is a command of the +<tt>ext2_group_desc</> object, to pass to the correct group descriptor. + +<sect1>The cd command +<p> + +The <tt>cd</> command performs the usual cd function. The path to the global +cd command is a path from <tt>/</>. + +<tt>This is one of the best examples of the power of the object oriented +design and of the dispatching mechanism. The operation is complicated, yet the +implementation is surprisingly short !</> + +<tscreen><code> +void type_ext2___cd (char *command_line) + +{ + char temp [80],buffer [80],*ptr; + + ptr=parse_word (command_line,buffer); + if (*ptr==0) { + wprintw (command_win,"Error - No argument specified\n"); + refresh_command_win ();return; + } + ptr=parse_word (ptr,buffer); + + if (buffer [0] != '/') { + wprintw (command_win,"Error - Use a full pathname (begin with '/')\n"); + refresh_command_win ();return; + } + + dispatch ("super");dispatch ("group");dispatch ("inode"); + dispatch ("next");dispatch ("dir"); + if (buffer [1] != 0) { + sprintf (temp,"cd %s",buffer+1);dispatch (temp); + } +} +</code></tscreen> + +Note the number of the dispatch calls ! + +<tt>super</> is used to get to the superblock. <tt>group</> to get to the +first group descriptor. <tt>inode</> brings us to the first inode - The bad +blocks inode. 
A <tt>next</> command is used to pass to the root directory inode, +a <tt>dir</> command "enters" the directory, and then we let the <tt>object +specific cd command</> take us from there (The object is <tt>dir</>, so +that <tt>dispatch</> will call the <tt>cd</> command of the <tt>dir</> type). +Note that following a symbolic link could bring us back to the root directory, +and the innocent calls above handle such a recursive case nicely ! + +I feel that the above is <tt>intuitive</> - I was expressing myself "in the +language" of the ext2 filesystem - (Go to the inode, etc), and the code was +written exactly in this spirit ! + +I could write more at this point, but I guess I am already a bit carried +away with the self compliments :-) + +<sect>The superblock +<p> + +This section details the handling of the superblock. + +<sect1>The superblock variables +<p> + +The superblock object is <tt>ext2_super_block</>. The definition is just +taken from the kernel ext2 main include file - /usr/include/linux/ext2_fs.h. +<footnote> +Those lines of source are copyrighted by <tt>Remy Card</> - The author of the +ext2 filesystem, and by <tt>Linus Torvalds</> - The first author of the Linux +operating system. Please cross reference the section Acknowledgments for the +full copyright. 
+</footnote> +<tscreen><code> +struct ext2_super_block { + __u32 s_inodes_count; /* Inodes count */ + __u32 s_blocks_count; /* Blocks count */ + __u32 s_r_blocks_count; /* Reserved blocks count */ + __u32 s_free_blocks_count; /* Free blocks count */ + __u32 s_free_inodes_count; /* Free inodes count */ + __u32 s_first_data_block; /* First Data Block */ + __u32 s_log_block_size; /* Block size */ + __s32 s_log_frag_size; /* Fragment size */ + __u32 s_blocks_per_group; /* # Blocks per group */ + __u32 s_frags_per_group; /* # Fragments per group */ + __u32 s_inodes_per_group; /* # Inodes per group */ + __u32 s_mtime; /* Mount time */ + __u32 s_wtime; /* Write time */ + __u16 s_mnt_count; /* Mount count */ + __s16 s_max_mnt_count; /* Maximal mount count */ + __u16 s_magic; /* Magic signature */ + __u16 s_state; /* File system state */ + __u16 s_errors; /* Behavior when detecting errors */ + __u16 s_pad; + __u32 s_lastcheck; /* time of last check */ + __u32 s_checkinterval; /* max. time between checks */ + __u32 s_creator_os; /* OS */ + __u32 s_rev_level; /* Revision level */ + __u16 s_def_resuid; /* Default uid for reserved blocks */ + __u16 s_def_resgid; /* Default gid for reserved blocks */ + __u32 s_reserved[0]; /* Padding to the end of the block */ + __u32 s_reserved[1]; /* Padding to the end of the block */ + . + . + . + __u32 s_reserved[234]; /* Padding to the end of the block */ +}; +</code></tscreen> + +Note that I <tt>expanded</> the array due to my primitive parser +implementation. The various fields are described in the <tt>technical +document</>. + +<sect1>The superblock commands +<p> + +This section explains the commands available in the <tt>ext2_super_block</> +type. They all appear in <tt>super_com.c</> + +<sect2>The show command +<p> + +The <tt>show</> command is overridden here in order to provide more +information than just the list of variables. A <tt>show</> command will end +up in calling <tt>type_super_block___show</>. 
+ +The first thing that we do is calling the <tt>general show command</> in +order to display the list of variables. + +We then add some interpretation to the various lines to make the data +somewhat more intuitive (Expansion of the time variables and the creator +operating system code, for example). + +We also display the <tt>backup copy number</> of the superblock in the status +window. This copy number is saved in the <tt>super_info</> global variable - +<tt>super_info.copy_num</>. Currently, this is the only variable there ... +but this type of internal variable saving is typical through my +implementation. + +<sect2>The backup copies handling commands +<p> + +The <tt>current copy number</> is available in <tt>super_info.copy_num</>. It +was initialized in the ext2 command <tt>super</>, and is used by the various +superblock routines. + +The <tt>gocopy</> routine will pass to another copy of the superblock. The +new device offset will be computed with the aid of the variables in the +<tt>file_system_info</> structure. Then the routine will <tt>dispatch</> to +the <tt>setoffset</> and the <tt>show</> routines. + +The <tt>setactivecopy</> routine will just save the current superblock data +in a temporary variable of type <tt>ext2_super_block</>, and will dispatch +<tt>gocopy 0</> to pass to the main superblock. Then it will place the saved +data in place of the actual data. + +The above two commands can be used if the main superblock is corrupted. + +<sect>The group descriptors +<p> + +The group descriptors handling mechanism allows the user to take a tour in +the group descriptors table, stopping at each point, and examining the +relevant inode table, block allocation map or inode allocation map through +dispatching to the relevant objects. 
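Stepping through the table amounts to simple offset arithmetic. The following is a minimal sketch, assuming the descriptors are packed back to back starting at <tt>first_group_desc_offset</>. The helper <tt>group_desc_offset</> and the structure name are hypothetical, and the kernel's <tt>__u32</> / <tt>__u16</> types are replaced by their <tt>stdint.h</> equivalents so that the block compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-alone copy of the group descriptor layout, for sizeof only. */
struct ext2_group_desc_sketch {
    uint32_t bg_block_bitmap;       /* Blocks bitmap block */
    uint32_t bg_inode_bitmap;       /* Inodes bitmap block */
    uint32_t bg_inode_table;        /* Inodes table block */
    uint16_t bg_free_blocks_count;  /* Free blocks count */
    uint16_t bg_free_inodes_count;  /* Free inodes count */
    uint16_t bg_used_dirs_count;    /* Directories count */
    uint16_t bg_pad;
    uint32_t bg_reserved[3];
};

/* Hypothetical helper: byte offset of descriptor number group_num
 * in the main copy of the table. */
unsigned long group_desc_offset(unsigned long first_group_desc_offset,
                                unsigned long group_num)
{
    return first_group_desc_offset
         + group_num * sizeof(struct ext2_group_desc_sketch);
}
```

Each descriptor occupies 32 bytes, so moving by one entry changes the device offset by 32.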
+ +Some information about the group descriptors is available in the global +variable <tt>group_info</>, which is of type <tt>struct_group_info</>: + +<tscreen><code> +struct struct_group_info { + unsigned long copy_num; + unsigned long group_num; +}; +</code></tscreen> + +<tt>group_num</> is the index of the current descriptor in the table. + +<tt>copy_num</> is the number of the current backup copy. + +<sect1>The group descriptor's variables +<p> + +<tscreen><code> +struct ext2_group_desc +{ + __u32 bg_block_bitmap; /* Blocks bitmap block */ + __u32 bg_inode_bitmap; /* Inodes bitmap block */ + __u32 bg_inode_table; /* Inodes table block */ + __u16 bg_free_blocks_count; /* Free blocks count */ + __u16 bg_free_inodes_count; /* Free inodes count */ + __u16 bg_used_dirs_count; /* Directories count */ + __u16 bg_pad; + __u32 bg_reserved[3]; +}; +</code></tscreen> + +The first three variables are used to provide the links to the +<tt>blockbitmap, inodebitmap and inode</> objects. + +<sect1>Movement in the table +<p> + +Movement in the group descriptors table is done using the <tt>next, prev and +entry</> commands. Note that the first two commands <tt>override</> the +general commands of the same name. The <tt>next and prev</> command are just +calling the <tt>entry</> function to do the job. I will show <tt>next</>, +for example: + +<tscreen><code> +void type_ext2_group_desc___next (char *command_line) + +{ + long entry_offset=1; + char *ptr,buffer [80]; + + ptr=parse_word (command_line,buffer); + if (*ptr!=0) { + ptr=parse_word (ptr,buffer); + entry_offset=atol (buffer); + } + + sprintf (buffer,"entry %ld",group_info.group_num+entry_offset); + dispatch (buffer); +} +</code></tscreen> +The <tt>entry</> function is also simple - It just calculates the offset +using the information in <tt>group_info</> and in <tt>file_system_info</>, +and uses the usual <tt>setoffset / show</> pair. + +<sect1>The show command +<p> + +As usual, the <tt>show</> command is overridden. 
The implementation is +similar to the superblock's show implementation - We just call the general +show command, and add some information in the status window - The contents of +the <tt>group_info</> structure. + +<sect1>Moving between backup copies +<p> + +This is done exactly like the superblock case. Please refer to explanation +there. + +<sect1>Links to the available friends +<p> + +From a group descriptor, one typically wants to reach an <tt>inode</>, or +one of the <tt>allocation bitmaps</>. This is done using the <tt>inode, +blockbitmap or inodebitmap</> commands. The implementation is again trivial +- Get the necessary information from the group descriptor, initialize the +structures of the next type, and issue the <tt>setoffset / settype</> pair. + +For example, here is the implementation of the <tt>blockbitmap</> command: + +<tscreen><code> +void type_ext2_group_desc___blockbitmap (char *command_line) + +{ + long block_bitmap_offset; + char buffer [80]; + + block_bitmap_info.entry_num=0; + block_bitmap_info.group_num=group_info.group_num; + + block_bitmap_offset=type_data.u.t_ext2_group_desc.bg_block_bitmap; + sprintf (buffer,"setoffset block %ld",block_bitmap_offset);dispatch (buffer); + sprintf (buffer,"settype block_bitmap");dispatch (buffer); +} +</code></tscreen> + +<sect>The inode table +<p> + +The inode handling enables the user to move in the inode table, edit the +various attributes of the inode, and follow to the next stage - A file or a +directory. 
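Moving in the inode table is a matter of converting between an entry number and a device offset. The sketch below illustrates the conversion in both directions. It is not EXT2ED's code: the function names are hypothetical, and it assumes the fixed 128-byte inode of the ext2 layout documented here and a table start offset that is already known from the group descriptor.

```c
/* Hypothetical helpers, not part of the EXT2ED sources. */

/* Assumed size in bytes of one on-disk inode in the layout described here. */
#define INODE_SIZE 128L

/* Byte offset on the device of inode table entry entry_num. */
long inode_entry_to_offset (long table_offset, long entry_num)
{
        return (table_offset + entry_num * INODE_SIZE);
}

/*
 * Entry number of the inode which contains device_offset.
 * An offset in the middle of an inode maps to that inode's entry.
 */
long inode_offset_to_entry (long table_offset, long device_offset)
{
        return ((device_offset - table_offset) / INODE_SIZE);
}
```

Recomputing the entry from the offset in this way means that no per-object state has to be kept between commands.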
+ +<sect1>The inode variables +<p> + +<tscreen><code> +struct ext2_inode { + __u16 i_mode; /* File mode */ + __u16 i_uid; /* Owner Uid */ + __u32 i_size; /* Size in bytes */ + __u32 i_atime; /* Access time */ + __u32 i_ctime; /* Creation time */ + __u32 i_mtime; /* Modification time */ + __u32 i_dtime; /* Deletion Time */ + __u16 i_gid; /* Group Id */ + __u16 i_links_count; /* Links count */ + __u32 i_blocks; /* Blocks count */ + __u32 i_flags; /* File flags */ + union { + struct { + __u32 l_i_reserved1; + } linux1; + struct { + __u32 h_i_translator; + } hurd1; + struct { + __u32 m_i_reserved1; + } masix1; + } osd1; /* OS dependent 1 */ + __u32 i_block[EXT2_N_BLOCKS]; /* Pointers to blocks */ + __u32 i_version; /* File version (for NFS) */ + __u32 i_file_acl; /* File ACL */ + __u32 i_dir_acl; /* Directory ACL */ + __u32 i_faddr; /* Fragment address */ + union { + struct { + __u8 l_i_frag; /* Fragment number */ + __u8 l_i_fsize; /* Fragment size */ + __u16 i_pad1; + __u32 l_i_reserved2[2]; + } linux2; + struct { + __u8 h_i_frag; /* Fragment number */ + __u8 h_i_fsize; /* Fragment size */ + __u16 h_i_mode_high; + __u16 h_i_uid_high; + __u16 h_i_gid_high; + __u32 h_i_author; + } hurd2; + struct { + __u8 m_i_frag; /* Fragment number */ + __u8 m_i_fsize; /* Fragment size */ + __u16 m_pad1; + __u32 m_i_reserved2[2]; + } masix2; + } osd2; /* OS dependent 2 */ +}; +</code></tscreen> + +The above is the original source code definition. We can see that the inode +supports <tt>operating system specific structures</>. In addition to the +expansion of the arrays, I have <tt>"flattened"</> the inode to support only +the <tt>Linux</> declaration. It seemed that this one occasion of multiple +variable aliases didn't justify the complication of generally supporting +aliases. In any case, the above system specific variables are not used +internally by EXT2ED, and the user is free to change the definition in +<tt>ext2.descriptors</> to suit their needs. 
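The <tt>i_mode</> field in the structure above packs the file type and the permission bits into 16 bits. As an illustration of how such a value can be interpreted, here is a minimal sketch that turns it into an <tt>ls</>-style string. The function name is hypothetical and the code is not taken from EXT2ED; the octal constants are the conventional Unix type values, written out explicitly so that the sketch does not depend on the host's headers.

```c
/* Hypothetical helper, not part of the EXT2ED sources. */

/* Conventional Unix file type bits, as stored in the high bits of i_mode. */
#define MODE_TYPE_MASK 0170000
#define MODE_TYPE_DIR  0040000
#define MODE_TYPE_REG  0100000
#define MODE_TYPE_LNK  0120000

/* Fill out (11 bytes at least) with a string such as "-rw-r--r--". */
void mode_to_string (unsigned short mode, char out [11])
{
        static const char bits [] = "rwxrwxrwx";
        int i;

        switch (mode & MODE_TYPE_MASK) {
                case MODE_TYPE_DIR: out [0] = 'd'; break;
                case MODE_TYPE_REG: out [0] = '-'; break;
                case MODE_TYPE_LNK: out [0] = 'l'; break;
                default:            out [0] = '?'; break;  /* Devices, fifos, etc. */
        }

        /* Walk the nine permission bits from owner read (0400) down to other execute (0001). */
        for (i = 0; i < 9; i++)
                out [i + 1] = (mode & (0400 >> i)) ? bits [i] : '-';

        out [10] = '\0';
}
```

Set-uid, set-gid and sticky bits are deliberately ignored to keep the sketch short.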
+ +<sect1>The handling functions +<p> + +The user interface to <tt>movement</> is the usual <tt>next / prev / +entry</> interface. There is really nothing special in those functions - The +size of the inode is fixed, the total number of inodes is known from the +superblock information, and the current entry can be figured out from the +device offset and the inode table start offset, which is known from the +corresponding group descriptor. Those functions are a bit older than some +other implementations of <tt>next</> and <tt>prev</>, and they do not save +information in a special structure. Rather, they recompute it when +necessary. + +The <tt>show</> command is overridden here, and provides a lot of additional +information about the inode - Its type, interpretation of the permissions, +special ext2 attributes (Immutable file, for example), and a lot more. +Again, the <tt>general show</> is called first, and then the additional +information is written. + +<sect1>Accessing files and directories +<p> + +From the inode, a <tt>file</> or a <tt>directory</> can typically be reached. +In order to treat a file, for example, its inode needs to be constantly +accessed. To satisfy that need, when editing a file or a directory, the +inode is still saved in memory - <tt>type_data</> is not overwritten. +Rather, the following takes place: +<itemize> +<item> An internal global structure which is used by the types <tt>file</> + and <tt>dir</> handling functions is initialized by calling the + appropriate function. +<item> The type is changed accordingly. +</itemize> +The result is that a <tt>settype ext2_inode</> is the only action necessary +to return to the inode - We actually never left it. 
+ +The implementation of the inode's <tt>file</> command follows: + +<tscreen><code> +void type_ext2_inode___file (char *command_line) + +{ + char buffer [80]; + + if (!S_ISREG (type_data.u.t_ext2_inode.i_mode)) { + wprintw (command_win,"Error - Inode type is not file\n"); + refresh_command_win (); return; + } + + if (!init_file_info ()) { + wprintw (command_win,"Error - Unable to show file\n"); + refresh_command_win ();return; + } + + sprintf (buffer,"settype file");dispatch (buffer); +} +</code></tscreen> + +As we can see - We just call <tt>init_file_info</> to get the necessary +information from the inode, and set the type to <tt>file</>. The next call +to <tt>show</> will dispatch to the <tt>file's show</> implementation. + +<sect>Viewing a file +<p> + +There isn't an ext2 kernel structure which corresponds to a file - A file is +just a series of blocks which are determined by its inode. As explained in +the last section, the inode is never actually left - The type is changed to +<tt>file</> - A type which contains no variables, and a special structure is +initialized: + +<tscreen><code> +struct struct_file_info { + + struct ext2_inode *inode_ptr; + + long inode_offset; + long global_block_num,global_block_offset; + long block_num,blocks_count; + long file_offset,file_length; + long level; + unsigned char buffer [EXT2_MAX_BLOCK_SIZE]; + long offset_in_block; + + int display; + /* The following is used if the file is a directory */ + + long dir_entry_num,dir_entries_count; + long dir_entry_offset; +}; +</code></tscreen> + +The <tt>inode_ptr</> will just point to the inode in <tt>type_data</>, which +is not overwritten while the user is editing the file, as the +<tt>setoffset</> command is not internally used. The <tt>buffer</> +will contain the currently viewed block of the file. The other variables +contain information about the current place in the file. For example, +<tt>global_block_num</> just contains the current block number. 
+ +The general idea is that the above data structure will provide the file +handling functions all the accurate information which is needed to accomplish +their task. + +The global structure of the above type, <tt>file_info</>, is initialized by +<tt>init_file_info</> in <tt>file_com.c</>, which is called by the +<tt>type_ext2_inode___file</> function when the user requests to watch the +file. <tt>It is updated as necessary to provide accurate information as long as +the file is edited.</> + +<sect1>Returning to the file's inode +<p> + +Concerning the method I used to handle files, the above task is trivial: +<tscreen><code> +void type_file___inode (char *command_line) + +{ + dispatch ("settype ext2_inode"); +} +</code></tscreen> + +<sect1>File movement +<p> + +EXT2ED keeps track of the current position in the file. Movement inside the +current block is done using <tt>next, prev and offset</> - They just change +<tt>file_info.offset_in_block</>. + +Movement between blocks is done using <tt>nextblock, prevblock and block</>. +To accomplish this, the direct blocks, indirect blocks, etc, need to be +traced. This is done by <tt>file_block_to_global_block</>, which accepts a +file's internal block number, and converts it to the actual filesystem block +number. 
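To make the tracing concrete, here is a minimal sketch that only classifies a file block number by the level of indirection needed to reach it. The function name is hypothetical and the code is a simplified illustration rather than EXT2ED's routine; it assumes 12 direct block pointers in the inode and 4-byte block numbers inside the indirect blocks.

```c
/* Hypothetical helper, not part of the EXT2ED sources. */

/* Number of direct block pointers in the inode (assumed value of EXT2_NDIR_BLOCKS). */
#define NDIR_BLOCKS 12L

/*
 * Return 0 for a direct block, 1 for an indirect block, 2 for a
 * double-indirect block and 3 for a triple-indirect block.
 */
int block_level (long file_block, long block_size)
{
        long ptrs_per_block = block_size / 4;   /* 4-byte block numbers */
        long last_direct    = NDIR_BLOCKS - 1;
        long last_indirect  = last_direct + ptrs_per_block;
        long last_dindirect = last_indirect + ptrs_per_block * ptrs_per_block;

        if (file_block <= last_direct)
                return (0);
        if (file_block <= last_indirect)
                return (1);
        if (file_block <= last_dindirect)
                return (2);
        return (3);
}
```

With a 1 KB block size this gives file blocks 0-11 as direct, 12-267 as indirect and 268-65803 as double-indirect. EXT2ED's own implementation of the full lookup follows.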
+ +<tscreen><code> +long file_block_to_global_block (long file_block,struct struct_file_info *file_info_ptr) + +{ + long last_direct,last_indirect,last_dindirect; + long f_indirect,s_indirect; + + last_direct=EXT2_NDIR_BLOCKS-1; + last_indirect=last_direct+file_system_info.block_size/4; + last_dindirect=last_indirect+(file_system_info.block_size/4) \ + *(file_system_info.block_size/4); + + if (file_block <= last_direct) { + file_info_ptr->level=0; + return (file_info_ptr->inode_ptr->i_block [file_block]); + } + + if (file_block <= last_indirect) { + file_info_ptr->level=1; + file_block=file_block-last_direct-1; + return (return_indirect (file_info_ptr->inode_ptr-> \ + i_block [EXT2_IND_BLOCK],file_block)); + } + + if (file_block <= last_dindirect) { + file_info_ptr->level=2; + file_block=file_block-last_indirect-1; + return (return_dindirect (file_info_ptr->inode_ptr-> \ + i_block [EXT2_DIND_BLOCK],file_block)); + } + + file_info_ptr->level=3; + file_block=file_block-last_dindirect-1; + return (return_tindirect (file_info_ptr->inode_ptr-> \ + i_block [EXT2_TIND_BLOCK],file_block)); +} +</code></tscreen> +<tt>last_direct, last_indirect, etc</>, contain the last internal block number +which is accessed by this method - If the requested block is not greater than +<tt>last_direct</>, for example, it is a direct block. + +If the block is a direct block, its number is just taken from the inode. +A non-direct block is handled by <tt>return_indirect, return_dindirect and +return_tindirect</>, which correspond to indirect, double-indirect and +triple-indirect. Each of the above functions is constructed using the lower +level functions. 
For example, <tt>return_dindirect</> is constructed as +follows: + +<tscreen><code> +long return_dindirect (long table_block,long block_num) + +{ + long f_indirect; + + f_indirect=block_num/(file_system_info.block_size/4); + f_indirect=return_indirect (table_block,f_indirect); + return (return_indirect (f_indirect,block_num%(file_system_info.block_size/4))); +} +</code></tscreen> + +<sect1>Object memory +<p> + +The <tt>remember</> command is overridden here and in the <tt>dir</> type - +We just remember the inode of the file. It is just simpler to implement, and +doesn't seem like a big limitation. + +<sect1>Changing data +<p> + +The <tt>set</> command is overridden, and provides the same functionality +as the <tt>general set</> command with no type declared. The +<tt>writedata</> command is overridden so that we'll write the edited block +(file_info.buffer) and not <tt>type_data</> (Which contains the inode). + +<sect>Directories +<p> + +A directory is just a file which is formatted according to a special format. +As such, EXT2ED handles directories and files quite alike. Specifically, the +same variable of type <tt>struct_file_info</> which is used in the +<tt>file</>, is used here. + +The <tt>dir</> type uses all the variables in the above structure, as +opposed to the <tt>file</> type, which didn't use the last ones. + +<sect1>The search_dir_entries function +<p> + +The entire situation is similar to that which was described in the +<tt>file</> type, with one main change: + +The main function in <tt>dir_com.c</> is <tt>search_dir_entries</>. This +function will <tt>"run"</> over all the entries in the directory, and will +call a client's function each time. The client's function is supplied as an +argument, and will check the current entry for a match, based on its own +criterion. 
It will then signal <tt>search_dir_entries</> whether to +<tt>ABORT</> the search, whether it <tt>FOUND</> the entry it was looking +for, or that the entry is still not found, and we should <tt>CONTINUE</> +searching. The declaration follows: +<tscreen><code> +struct struct_file_info search_dir_entries \ + (int (*action) (struct struct_file_info *info),int *status) + +/* + This routine runs on all directory entries in the current directory. + For each entry, action is called. The return code of action is one of + the following: + + ABORT - Current dir entry is returned. + CONTINUE - Continue searching. + FOUND - Current dir entry is returned. + + If the last entry is reached, it is returned, along with an ABORT status. + + status is updated to the returned code of action. +*/ +</code></tscreen> + +With the above tool in hand, many operations are simple to perform - Here is +the way I counted the entries in the current directory: + +<tscreen><code> +long count_dir_entries (void) + +{ + int status; + + return (search_dir_entries (&ero;action_count,&ero;status).dir_entry_num); +} + +int action_count (struct struct_file_info *info) + +{ + return (CONTINUE); +} +</code></tscreen> +It will just <tt>CONTINUE</> until the last entry. The returned structure +(of type <tt>struct_file_info</>) will have its number in the +<tt>dir_entry_num</> field, and this is exactly the required number ! + +<sect1>The cd command +<p> + +The <tt>cd</> command accepts a relative path, and moves there ... +The implementation is of course a bit more complicated: +<enum> +<item> The path is checked that it is not an absolute path (from <tt>/</>). + If it is, we let the <tt>general cd</> do the job by calling + <tt>type_ext2___cd</> directly. +<item> The path is divided into the nearest path and the rest of the path. + For example, cd 1/2/3/4 is divided into <tt>1</> and into + <tt>2/3/4</>. +<item> It is the first part of the path that we need to search for in the + current directory. 
We search for it using <tt>search_dir_entries</>, + which accepts the <tt>action_name</> function as the user defined + function. +<item> <tt>search_dir_entries</> will scan all the entries and will call + our <tt>action_name</> function for each entry. In + <tt>action_name</>, the required name will be checked against the + name of the current entry, and <tt>FOUND</> will be returned when a + match occurs. +<item> If the required entry is found, we dispatch a <tt>remember</> + command to insert the current <tt>inode</> into the object memory. + This is required to easily support <tt>symbolic links</> - If we + find later that the inode pointed to by the entry is actually a + symbolic link, we'll need to return to this point, and the above + inode doesn't have (and can't have, because of <tt>hard links</>) the + information necessary to "move back". +<item> We then dispatch a <tt>followinode</> command to reach the inode + pointed to by the required entry. This command will automatically + change the type to <tt>ext2_inode</> - We are now at an inode, and + all the inode commands are available. +<item> We check the inode's type to see if it is a directory. If it is, we + dispatch a <tt>dir</> command to "enter the directory", and + recursively call ourselves (The type is <tt>dir</> again) by + dispatching a <tt>cd</> command, with the rest of the path as an + argument. +<item> If the inode's type is a symbolic link (only fast symbolic links are + implemented so far. I guess this is typically the case.), we note + the path it is pointing at, the saved inode is recalled, we dispatch + <tt>dir</> to get back to the original directory, and we call + ourselves again with the <tt>link path/rest of the path</> argument. +<item> In any other case, we just stop at the resulting inode. +</enum> + +<sect>The block and inode allocation bitmaps +<p> + +The block allocation bitmap is reached through the corresponding group descriptor. 
+The group descriptor handling functions will save the necessary information +into a structure of the <tt>struct_block_bitmap_info</> type: + +<tscreen><code> +struct struct_block_bitmap_info { + unsigned long entry_num; + unsigned long group_num; +}; +</code></tscreen> + +The <tt>show</> command is overridden, and will show the block as a series of +bits, each bit corresponding to a block. The main variable is the +<tt>entry_num</> variable, declared above, which is just the current block +number in this block group. The current entry is highlighted, and the +<tt>next, prev and entry</> commands just change the above variable. + +The <tt>allocate and deallocate</> commands change the specified bits. Nothing +special about them - They just contain code which converts between bit and +byte locations. + +The <tt>inode allocation bitmap</> is treated in much the same fashion, with +the same commands available. + +<sect>Filesystem size limitation +<p> + +While an ext2 filesystem has a size limit of <tt>4 TB</>, EXT2ED currently +<tt>can't</> handle filesystems which are <tt>bigger than 2 GB</>. + +This limitation results from my usage of <tt>32 bit long variables</> and +of the <tt>fseek</> library call, which can't seek up to 4 TB. + +By looking in the <tt>ext2 library</> source code by <tt>Theodore Ts'o</>, +I discovered the <tt>llseek</> system call which can seek to a +<tt>64 bit unsigned long long</> offset. Correcting the situation is not +difficult in concept - I need to change long into unsigned long long where +appropriate and modify <tt>disk.c</> to use the llseek system call. + +However, fixing the above limitation involves making changes in many places +in the code and will obviously make the entire code less stable. For that +reason, I chose to release EXT2ED as it is now and to postpone the above fix +to the next release. 
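The kind of change described above can be sketched as follows. This is only an illustration under stated assumptions, not the planned fix: the function names are hypothetical, and instead of the <tt>llseek</> call mentioned in the text the sketch uses the POSIX <tt>lseek</> call with a 64 bit <tt>off_t</>, which is requested on 32 bit systems by defining <tt>_FILE_OFFSET_BITS</> before any system header is included.

```c
/* Hypothetical sketch, not part of the EXT2ED sources. */

#define _FILE_OFFSET_BITS 64    /* Ask for a 64 bit off_t where it is not the default */

#include <sys/types.h>
#include <unistd.h>

/* Compute the byte offset of a block without overflowing a 32 bit long. */
unsigned long long block_to_offset (unsigned long block, unsigned long block_size)
{
        return ((unsigned long long) block * block_size);
}

/* Position fd at the start of the given block. Returns 0 on success, -1 on failure. */
int seek_to_block (int fd, unsigned long block, unsigned long block_size)
{
        off_t target = (off_t) block_to_offset (block, block_size);

        return (lseek (fd, target, SEEK_SET) == target ? 0 : -1);
}
```

The important detail is the cast before the multiplication: block 4194304 with 1 KB blocks lies exactly at the 4 GB boundary, which a 32 bit product cannot represent.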
+ +<sect>Conclusion +<p> + +Had I known in advance the structure of the ext2 filesystem, I feel that +the resulting design would have been quite different from the presented +design above. + +EXT2ED has now two levels of abstraction - A <tt>general</> filesystem, and an +<tt>ext2</> filesystem, and the surface is more or less prepared for additions +of other filesystems. Had I approached the design in the "engineering" way, +I guess that the first level above would not have existed. + +<sect>Copyright +<p> + +EXT2ED is Copyright (C) 1995 Gadi Oxman. + +EXT2ED is hereby placed under the GPL - Gnu Public License. You are free and +welcome to copy, view and modify the sources. My only wish is that my +copyright presented above will be left and that a list of the bug fixes, +added features, etc, will be provided. + +The entire EXT2ED project is based, of-course, on the kernel sources. The +<tt>ext2.descriptors</> distributed with EXT2ED is a slightly modified +version of the main ext2 include file, /usr/include/linux/ext2_fs.h. Follows +the original copyright: + +<tscreen><verb> +/* + * linux/include/linux/ext2_fs.h + * + * Copyright (C) 1992, 1993, 1994, 1995 + * Remy Card (card@masi.ibp.fr) + * Laboratoire MASI - Institut Blaise Pascal + * Universite Pierre et Marie Curie (Paris VI) + * + * from + * + * linux/include/linux/minix_fs.h + * + * Copyright (C) 1991, 1992 Linus Torvalds + */ + +</verb></tscreen> + +<sect>Acknowledgments +<p> + +EXT2ED was constructed as a student project in the software +laboratory of the faculty of electrical-engineering in the +<tt>Technion - Israel's institute of technology</>. + +At first, I would like to thank <tt>Avner Lottem</> and <tt>Doctor Ilana +David</> for their interest and assistance in this project. 
+ +I would also like to thank the following people, who were involved in the +design and implementation of the ext2 filesystem kernel code and support +utilities: +<itemize> +<item> <tt>Remy Card</> + + Who designed, implemented and maintains the ext2 filesystem kernel + code, and some of the ext2 utilities. <tt>Remy Card</> is also the + author of several helpful slides concerning the ext2 filesystem. + Specifically, he is the author of <tt>File Management in the Linux + Kernel</> and of <tt>The Second Extended File System - Current + State, Future Development</>. + +<item> <tt>Wayne Davison</> + + Who designed the ext2 filesystem. +<item> <tt>Stephen Tweedie</> + + Who helped designing the ext2 filesystem kernel code and wrote the + slides <tt>Optimizations in File Systems</>. +<item> <tt>Theodore Ts'o</> + + Who is the author of several ext2 utilities and of the ext2 library + <tt>libext2fs</> (which I didn't use, simply because I didn't know + it exists when I started to work on my project). +</itemize> + +Lastly, I would like to thank, of-course, <tt>Linus Torvalds</> and the +<tt>Linux community</> for providing all of us with such a great operating +system. + +Please contact me in a case of bug report, suggestions, or just about +anything concerning EXT2ED. + +Enjoy, + +Gadi Oxman <tgud@tochnapc2.technion.ac.il> + +Haifa, August 95 +</article>
\ No newline at end of file diff --git a/ext2ed/doc/ext2ed.8 b/ext2ed/doc/ext2ed.8 new file mode 100644 index 00000000..e153ff8b --- /dev/null +++ b/ext2ed/doc/ext2ed.8 @@ -0,0 +1,72 @@ +.\" -*- nroff -*- +.TH EXT2ED 8 "August 1995" "Version 0.1" +.SH NAME +ext2ed \- ext2 file system editor +.SH SYNOPSIS +.B ext2ed +.SH DESCRIPTION +.B ext2ed +is an +.B editor +for the +.B second extended filesystem. +Its aim is to show you the various internal filesystem structures in an +intuitive form so that you would be able to easily understand and modify +them. +.SH DOCUMENTATION +The documentation is not available in man page format. Instead, I have +written three articles which are related to ext2ed: + +The first article is +.B The user's guide. +This article explains how to use ext2ed. + +The second article is +.B The Ext2fs overview. +This article gives an overview of the internal structure of the ext2 filesystem. +You need to understand the internal layout in order to effectively edit +your filesystem. + +The third article is +.B EXT2ED - Design and implementation. +This article explains how I constructed ext2ed. You may want to have a look +at it if you plan to view or modify the source code. + +.SH WARNING + +.B +Do not use ext2ed on a mounted filesystem. + +.SH FILES +.TP +.I /usr/bin/ext2ed +The program itself. +.TP +.I /var/lib/ext2ed/ext2ed.conf +ext2ed's configuration file. +.TP +.I /var/lib/ext2ed/ext2.descriptors +Definition of the various objects for the ext2 filesystem. +.TP +.I /var/lib/ext2ed/ext2ed.log +Log file of actual changes made to the filesystem. +.TP +.I /usr/man/man8/ext2ed.8 +The manual page. +.TP +.I /usr/doc/ext2ed/user-guide-0.1.ps +The user's guide. +.TP +.I /usr/doc/ext2ed/Ext2fs-overview-0.1.ps +Technical overview of the ext2 filesystem. +.TP +.I /usr/doc/ext2ed/ext2ed-design-0.1.ps +EXT2ED design notes. + +.SH BUGS +Filesystems bigger than 2 GB aren't yet supported. 
+.SH AUTHOR +Gadi Oxman <tgud@tochnapc2.technion.ac.il> +.SH SEE ALSO +.BR e2fsck (8), +.BR debugfs (8) diff --git a/ext2ed/doc/user-guide-0.1.sgml b/ext2ed/doc/user-guide-0.1.sgml new file mode 100644 index 00000000..c494a7e7 --- /dev/null +++ b/ext2ed/doc/user-guide-0.1.sgml @@ -0,0 +1,1189 @@ +<!doctype linuxdoc system> + +<!-- EXT2ED user's guide --> +<!-- First written: July 22 1995 --> +<!-- Last updated: August 3 1995 --> +<!-- This document is written Using the Linux documentation project Linuxdoc-SGML DTD --> + +<article> + +<title>EXT2ED - The Extended-2 filesystem editor - User's guide +<author>Gadi Oxman, tgud@tochnapc2.technion.ac.il +<date>v0.1, August 3 1995 +<abstract> +This is only the initial version of this document. It may be unclear in +some places. Please send me feedback on anything regarding it. +</abstract> +<toc> + +<!-- Begin of document --> + +<sect>About EXT2ED documentation +<p> + +The EXT2ED documentation consists of three parts: +<itemize> +<item> The ext2 filesystem overview. +<item> The EXT2ED user's guide. +<item> The EXT2ED design and implementation. +</itemize> + +If you intend to use EXT2ED, I strongly suggest that you be familiar +with the material presented in the <tt>ext2 filesystem overview</> as well. + +If you also intend to browse and modify the source code, I suggest that you +also read the article <tt>The EXT2ED design and implementation</>, as it +provides a general overview of the structure of my source code. + +<sect>Introduction + +<p> +EXT2ED is a "disk editor" for the ext2 filesystem. Its purpose is to show +you the internal structures of the ext2 filesystem in a rather intuitive +and logical way, so that it will be easier to "travel" between the various +internal filesystem structures. + +<sect>Basic concepts in EXT2ED + +<p> +Two basic concepts in EXT2ED are <tt>commands</> and <tt>types</>. 
+ +EXT2ED is object-oriented in the sense that it defines objects in the +filesystem, like a <tt>super-block</> or a <tt>directory</>. An object is +something which "knows" how to handle some aspect of the filesystem. + +Your interaction with EXT2ED is done through <tt>commands</> which EXT2ED +accepts. There are three levels of commands: +<itemize> +<item> General Commands +<item> Extended-2 Filesystem general commands +<item> Type specific commands +</itemize> +The General commands are always available. + +The ext2 general commands are available only when editing an ext2 filesystem. + +The Type specific commands are available when editing a specific object in the +filesystem. Each object typically comes with its own set of internal +variables, and its own set of commands, which are fine tuned to handle the +corresponding structure in the filesystem. +<sect>Running EXT2ED +<p> +Running EXT2ED is as simple as typing <tt>ext2ed</> from the shell prompt. +There are no command line switches. + +When first run, EXT2ED parses its configuration file, <tt>ext2ed.conf</>. +This file must exist. + +When the configuration file processing is done, the EXT2ED screen should +appear, with the command prompt <tt>ext2ed></> displayed. + +<sect>EXT2ED user interface + +<p> +EXT2ED uses the <em>ncurses</> library for screen management. Your screen +will be divided into four parts, from top to bottom: +<itemize> +<item> Title window +<item> Status window +<item> Main editing window +<item> Command window +</itemize> +The title window just displays the current version of EXT2ED. + +The status window will display various information regarding the state of +the editing at this point. + +The main editing window is the place at which the actual data will be shown. +Almost every command will cause some display at this window. This window, as +opposed to the three others, is of variable length - You always look at one +page of it. 
The current page and the total numbers of pages at this moment +is displayed at the status window. Moving between pages is done by the use +of the <tt>pgdn</> and <tt>pgup</> commands. + +The command window is at the bottom of the screen. It always displays a +command prompt <tt>ext2ed></> and allows you to type a command. Feedback +about the commands entered is displayed to this window also. + +EXT2ED uses the <em>readline</> library while processing a command line. All +the usual editing keys are available. Each entered command is placed into a +history of commands, and can be recalled later. Command Completion is also +supported - Just start to type a command, and press the completion key. + +Pressing <tt>enter</> at the command window, without entering a command, +recalls the last command. This is useful when moving between close entries, +in the <tt>next</> command, for example. + +<sect>Getting started + +<p> + +<sect1>A few precautions + +<p> + +EXT2ED is a tool for filesystem <tt>editing</>. As such, it can be +<tt>dangerous</>. The summary to the subsections below is that +<tt>You must know what you are doing</>. + +<sect2><label id="mounted_ref">A mounted filesystem + +<p> + +EXT2ED is not designed to work on a mounted filesystem - It is complicated +enough as it is; I didn't even try to think of handling the various race +conditions. As such, please respect the following advice: + +<tt>Do not use EXT2ED on a mounted filesystem !</> + +EXT2ED will not allow write access to a mounted filesystem. Although it is +fairly easy to change EXT2ED so that it will be allowed, I hereby request +again- EXT2ED is not designed for that action, and will most likely corrupt +data if used that way. Please don't do that. + +Concerning read access, I chose to leave the decision for the user through +the configuration file option <tt>AllowMountedRead</>. 
Although read access +on a mounted partition will not do any damage to the filesystem, the data +displayed to you will not be reliable, and showing you incorrect information +may be as bad as corrupting the filesystem. However, you may still wish to +do that. + +<sect2>Write access + +<p> + +Considering the obvious sensitivity of the subject, I took the following +actions: + +<enum> +<item> EXT2ED will always start with a read-only access. Write access mode + needs to be specifically entered by the <tt>enablewrite</> command. + Until this is done, no write will be allowed. Write access can be + disabled at any time with <tt>disablewrite</>. When + <tt>enablewrite</> is issued, the device is reopened in read-write + mode. Needless to say, the device permissions should allow that. +<item> As a second level of protection, you can disallow write access in + the configuration file by using the <tt>AllowChanges off</> + configuration option. In this case, the <tt>enablewrite</> command + will be refused. +<item> When write access is enabled, the data will never change + immediately. Rather, a specific <tt>writedata</> command is needed + to update the object in the disk with the changed object in memory. +<item> In addition, A logging option is provided through the configuration + file options <tt>LogChanges</> and <tt>LogFile</>. With logging + enabled, each change to the disk will be logged at a very primitive + level - A hex dump of the original data and of the new written data. + The log file will be a text file which is easily readable, and you + can make use of it to undo any changes which you made (EXT2ED doesn't + make use of the log file for that purpose, it just logs the changes). +</enum> +Please remember that this is only the initial release of EXT2ED, and it is +not very much tested - It is reasonable to assume that <tt>there are +bugs</>. +However, the logging option above can offer protection even from this +unfortunate case. 
Therefore, I highly recommend that at least when first +working with EXT2ED, the logging option will be enabled, despite the disk +space which it consumes. + +<sect1><label id="help_ref">The help command + +<p> + +When loaded, EXT2ED will show a short help screen. This help screen can +always be retrieved by the command <tt>help</>. The help screen displays a +list of all the commands which are available at this point. At startup, only +the <tt>General commands</> are available. +This will change with time, since each object has its own commands. Thus, +commands which are available now may not be available later. +Using <tt>help</> <em>command</> will display additional information about +the specific command <em>command</>. + +<sect1><label id="setdevice_ref">The setdevice command + +<p> + +The first command that is usually entered to EXT2ED is the <tt>setdevice</> +command. This command simply tells EXT2ED on which device the filesystem is +present. For example, suppose my ext2 filesystem is on the first partition +of my IDE disk. The command will be: +<tscreen><verb> +setdevice /dev/hda1 +</verb></tscreen> +The following actions will take place in the following order: +<enum> +<item> EXT2ED will check if the partition is mounted. + If the partition is mounted (<tt>highly not recommended</>), + the accept/reject behavior will be decided by the configuration + file. Cross reference section <ref id="mounted_ref">. +<item> The specified device will be opened in read-only mode. The + permissions of the device should be set in a way that allows + you to open the device for read access. +<item> Autodetection of an ext2 filesystem will be made by searching for + the ext2 magic number in the main superblock. +<item> In the case of a successful recognition of an ext2 filesystem, the + ext2 filesystem specific commands and the ext2 specific object + definitions will be registered. 
The object definitions will be read +at run time from a file specified by the configuration file. + + In case of a corrupted ext2 filesystem, it is quite possible that + the main superblock is damaged and autodetection will fail. In that + case, use the configuration option <tt>ForceExt2 on</>. This is not + the default case since EXT2ED can be used at a lower level to edit a + non-ext2 filesystem. +<item> In the case of a successful autodetection, essential information about + the filesystem such as the block size will be read from the + superblock, unless the user overrides this behavior with a + configuration option (not recommended). In that case, the parameters + will be read from the configuration file. + + In the case of an autodetection failure, the essential parameters + will be read from the configuration file. +</enum> +Assuming that you are editing an ext2 filesystem and that everything goes +well, you will notice that additional commands are now available in the help +screen, under the section <tt>ext2 filesystem general commands</>. In +addition, EXT2ED now recognizes a few objects which are essential to the +editing of an ext2 filesystem. + +<sect>Two levels of usage + +<p> + +<sect1>Low level usage + +<p> +This section explains what EXT2ED provides even when not editing an ext2 +filesystem. + +Even at this level, EXT2ED is more than just a hex editor. It still allows +definition of objects and variables at run time through a user file, +although of course the objects will not have special fine-tuned functions +connected to them. EXT2ED will allow you to move in the filesystem using +<tt>setoffset</>, and to apply an object definition on a specific place +using <tt>settype</> <em>type</>. From this point on, the object will +be shown <tt>in its native form</> - You will see a list of the +variables rather than just a hex dump, and you will be able to change each +variable in the intuitive form <tt>set variable=value</>. 
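To make the idea of applying an object definition at an offset more concrete, here is a minimal sketch in Python. It is not EXT2ED code: the names `DEMO_TYPE`, `apply_type` and `set_variable` are hypothetical, and the three fields are simply the first three 32-bit fields of the ext2 superblock, assumed to be stored little-endian. It only illustrates how raw bytes at a position turn into a list of named variables, and how changing one variable touches only the in-memory copy.

```python
import struct

# Hypothetical object definition: (variable name, struct format code).
# These mirror the first three 32-bit fields of the ext2 superblock.
DEMO_TYPE = [
    ("s_inodes_count", "I"),
    ("s_blocks_count", "I"),
    ("s_r_blocks_count", "I"),
]

def apply_type(buf, offset, fields):
    """Decode the bytes at `offset` as the given little-endian fields."""
    fmt = "<" + "".join(code for _, code in fields)
    values = struct.unpack_from(fmt, buf, offset)
    return dict(zip((name for name, _ in fields), values))

def set_variable(buf, offset, fields, name, value):
    """Change one variable in the in-memory buffer only."""
    pos = offset
    for field_name, code in fields:
        if field_name == name:
            struct.pack_into("<" + code, buf, pos, value)
            return
        pos += struct.calcsize("<" + code)
    raise KeyError(name)

# A 16-byte in-memory stand-in for a device, with an object at byte 4.
device = bytearray(16)
struct.pack_into("<III", device, 4, 2048, 8192, 409)

obj = apply_type(device, 4, DEMO_TYPE)            # roughly: setoffset 4, settype ...
set_variable(device, 4, DEMO_TYPE, "s_r_blocks_count", 1400)
```

Without a type, the same sixteen bytes would only be viewable as a hex dump; with the type applied, each variable can be read and changed by name.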
+ +To define objects, use the configuration option <tt>AlternateDescriptors</>. + +There are now two forms of editing: +<itemize> +<item> Editing without a type. In this case, the disk block will be shown +as a text+hex dump, and you will be able to move along and change it. +<item> Editing with a type. In this case, the object's variables will be +shown, and you will be able to change each variable in its native form. +</itemize> + +<sect1>High level usage + +<p> +EXT2ED was designed for the editing of the ext2 filesystem. As such, it +"understands" the filesystem structure to some extent. Each object now has +special fine-tuned 'C' functions connected to it, which know how to display +it in an intuitive form, and how the object fits in the general design of +the ext2 filesystem. It is of course much easier to use this type of +editing. For example: +<tscreen> +Issue <em>group 2</> to look at the main copy of the third group block +descriptor. With <em>gocopy 1</> you can move to its first backup copy, +and with <em>inode</> you can start editing the inode table of the above +group block. From here, if the inode corresponds to a file, you can +use <em>file</> to edit the file in a "continuous" way, using +<em>nextblock</> to pass to its next block, letting EXT2ED follow by +itself the direct blocks, indirect blocks, ..., while still preserving the +actual view of the exact block usage of the file. +</tscreen> +The point is that the "tour" of the filesystem will now be synchronic rather +than asynchronic - Each object has the "links" to pass between connected +logical structures, and special fine-tuned functions to deal with it. + +<sect>General commands + +<p> +I will now start with a systematic explanation of the general commands. +Please feel free to experiment, but take care when using the +<tt>enablewrite</> command. + +Whenever a command syntax is specified, arguments which are optional are +enclosed in square brackets. 
+ +Please note that in EXT2ED, each command can be overridden by a specific +object to provide special fine-tuned functionality. In general, I have +tried to keep functions which are accessible by the same name similar +to each other. + +<sect1><label id="disablewrite_ref">disablewrite +<p> +<tscreen><verb> +Syntax: disablewrite +</verb></tscreen> +<tt>disablewrite</> is used to reopen the device with read-only access. When +first running EXT2ED, the device is opened in read-only mode, and an +explicit <tt>enablewrite</> is required for write access. When you have +finished making changes, a <tt>disablewrite</> is recommended for safety. Cross +reference section <ref id="enablewrite_ref">. + +<sect1><label id="enablewrite_ref">enablewrite +<p> +<tscreen><verb> +Syntax: enablewrite +</verb></tscreen> +<tt>enablewrite</> is used to reopen the device with read-write access. +When first running EXT2ED, the device is opened in read-only mode, and an +explicit <tt>enablewrite</> is required for write access. +<tt>enablewrite</> will fail if write access is disabled from the +configuration file by the <tt>AllowChanges off</> configuration option. +Even after <tt>enablewrite</>, an explicit <tt>writedata</> +is required to actually write the new data to the disk. +When you have finished making changes, a <tt>disablewrite</> is recommended for safety. +Cross reference section <ref id="disablewrite_ref">. + +<sect1>help +<p> +<tscreen><verb> +Syntax: help [command] +</verb></tscreen> +The <tt>help</> command is described at section <ref id="help_ref">. + +<sect1><label id="next_ref">next +<p> +<tscreen><verb> +Syntax: next [number] +</verb></tscreen> +This section describes the <em>general command</> <tt>next</>. <tt>next</> +is overridden by several types in EXT2ED, to provide fine-tuned +functionality. + +The behavior of the <tt>next general command</> depends on whether you are editing a +specific object, or none. 
+ +<itemize> +<item> In the case where Type is <tt>none</> (the current type is shown + on the status window by the <tt>show</> command), <tt>next</> + moves forward <em>number</> bytes in the current edited block. + If <em>number</> is not specified, <em>number=1</> is assumed. +<item> In the case where Type is defined, the <tt>next</> command assumes + that you are editing an array of objects of that type, and the + <tt>next</> command will just pass to the next entry in the array. + If <em>number</> is defined, it will pass <em>number</> entries + ahead. +</itemize> + +<sect1><label id="pgdn_ref">pgdn +<p> +<tscreen><verb> +Syntax: pgdn +</verb></tscreen> +Usually the edited data doesn't fit into the visible main window. In this +case, the status window will indicate that there is more to see "below" by +the message <tt>Page x of y</>. This means that there are <em>y</> pages +in total, and you are currently viewing page <em>x</>. With the <tt>pgdn</> +command, you can pass to the next available page. + +<sect1>pgup +<p> +<tscreen><verb> +Syntax: pgup +</verb></tscreen> + +<tt>pgup</> is the opposite of <tt>pgdn</> - It will pass to the previous +page. Cross reference section <ref id="pgdn_ref">. + +<sect1>prev +<p> +<tscreen><verb> +Syntax: prev [number] +</verb></tscreen> + +<tt>prev</> is the opposite of <tt>next</>. Cross reference section +<ref id="next_ref">. + +<sect1><label id="recall_ref">recall +<p> +<tscreen><verb> +Syntax: recall object +</verb></tscreen> +<tt>recall</> is the opposite of <tt>remember</>. It will return you to the +place you were at when you saved the object position and type information. Cross +reference section <ref id="remember_ref">. + +<sect1>redraw +<p> +<tscreen><verb> +Syntax: redraw +</verb></tscreen> +Sometimes the screen display gets corrupted. I still have problems with +this. The <tt>redraw</> command simply redraws the entire display screen. 
+ +<sect1><label id="remember_ref">remember +<p> +<tscreen><verb> +Syntax: remember object +</verb></tscreen> +EXT2ED provides you with a <tt>memory</> of objects; while editing, you may reach an +object which you would like to return to later. The <tt>remember</> command +will store in memory the current place and type of the object. You can +return to the object by using the <tt>recall</> command. Cross reference +section <ref id="recall_ref">. + +<tt>Note:</> +<itemize> +<item> When remembering a <tt>file</> or a <tt>directory</>, the + corresponding inode will be saved in memory. The basic reason is that + the inode is essential for finding the blocks of the file or the + directory. +</itemize> + +<sect1>set +<p> +<tscreen><verb> +Syntax: set [text || hex] arg1 [arg2 arg3 ...] + +or + +Syntax: set variable=value +</verb></tscreen> +The <tt>set</> command is used to modify the current data. +The behavior of the <tt>set general command</> depends on whether you are editing a +specific object, or none. + +<itemize> +<item> In the case where Type is <tt>none</>, the first syntax should be + used. The set command affects the data starting at the current + highlighted position in the edited block. + <itemize> + <item> When using the <tt>set hex</> command, a list of + hexadecimal bytes should follow. + <item> When using the <tt>set text</> command, it should be followed + by a text string. + </itemize> + Examples: + <tscreen><verb> + set hex 09 0a 0b 0c 0d 0e 0f + set text Linux is just great ! + </verb></tscreen> +<item> In the case where Type is defined, the second syntax should be used. + The set command just sets the variable <em>variable</> to the + value <em>value</>. +</itemize> +In any case, the data is only changed in memory. For an actual update to the +disk, use the <tt>writedata</> command. + +<sect1>setdevice +<p> +<tscreen><verb> +Syntax: setdevice device +</verb></tscreen> +The <tt>setdevice</> command is described at section <ref id="setdevice_ref">. 
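As a rough illustration of the untyped form of the <tt>set</> command described above, the following Python sketch models a block as an in-memory buffer. The function names `set_hex` and `set_text` are hypothetical and only mimic the documented behavior: the bytes are overwritten starting at the current position, and nothing reaches the "disk" until a separate write step is performed.

```python
def set_hex(block, pos, hex_bytes):
    """Overwrite bytes starting at `pos` with a list of hex byte strings."""
    data = bytes(int(h, 16) for h in hex_bytes)
    if pos + len(data) > len(block):
        raise ValueError("edit would run past the end of the block")
    block[pos:pos + len(data)] = data

def set_text(block, pos, text):
    """Overwrite bytes starting at `pos` with an ASCII text string."""
    data = text.encode("ascii")
    if pos + len(data) > len(block):
        raise ValueError("edit would run past the end of the block")
    block[pos:pos + len(data)] = data

disk_block = bytes(16)                  # stand-in for the block on disk
memory_block = bytearray(disk_block)    # the copy that is being edited

set_hex(memory_block, 0, "09 0a 0b 0c".split())
set_text(memory_block, 4, "Linux")

# The edit exists only in memory at this point.
changed_before_write = bytes(memory_block) != disk_block

# What a writedata step would amount to in this toy model.
disk_block = bytes(memory_block)
```

Keeping the edit and the write as two separate steps is what makes it possible to inspect a change before committing it.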
+ +<sect1>setoffset +<p> +<tscreen><verb> +Syntax: setoffset [block || type] [+|-]offset +</verb></tscreen> +The <tt>setoffset</> command is used to jump directly to an arbitrary position inside the file +system. It is considered a low level command, and usually should not be used +when editing an ext2 filesystem, simply because movement is better +handled through the specific ext2 commands. + +The <tt>offset</> is in bytes, and for now should be positive and smaller +than 2GB. + +Use of the <tt>block</> modifier changes the counting unit to block. + +Use of the <tt>+ or -</> modifiers signals that the offset is relative to +the current position. + +Use of the <tt>type</> modifier is allowed only with relative offset. This +modifier will multiply the offset by the size of the current type. + +<sect1>settype +<p> +<tscreen><verb> +Syntax: settype type || [none | hex] +</verb></tscreen> +The <tt>settype</> command is used to apply the object definitions of +the type <em>type</> on the current position. It is considered a low level +command and usually should not be used when editing an ext2 filesystem since +EXT2ED provides better tools. It is of course very useful when editing a +non-ext2 filesystem and using user-defined objects. + +When <em>type</> is <em>hex</> or <em>none</>, the data will be displayed as +a hex and text dump. + +<sect1>show +<p> +<tscreen><verb> +Syntax: show +</verb></tscreen> +The <tt>show</> command will show the data of the current object at the +current position on the main display window. It will also update the status +window with type specific information. It may be necessary to use +<tt>pgdn</> and <tt>pgup</> to view the entire data. + +<sect1>writedata +<p> +<tscreen><verb> +Syntax: writedata +</verb></tscreen> +The <tt>writedata</> command will update the disk with the object data that +is currently in memory. This is the point at which actual change is made to +the filesystem. Without this command, the edited data will not have any +effect. 
Write access must be enabled for the update to succeed. + +<sect>Editing an ext2 filesystem +<p> + +In order to edit an ext2 filesystem, you should, of course, know the structure +of the ext2 filesystem. If you feel that you lack some knowledge in this +area, I suggest that you do some of the following: +<itemize> +<item> Read the supplied ext2 technical information. I tried to summarize + the basic information which is needed to get you started. +<item> Get the slides that Remy Card (the author of the ext2 filesystem) + prepared concerning the ext2 filesystem. +<item> Read the kernel sources. +</itemize> +At this point, you should be familiar with the following terms: +<tt>block, inode, superblock, block groups, block allocation bitmap, inode +allocation bitmap, group descriptors, file, directory.</> Most of the above +are objects in EXT2ED. + +When editing an ext2 filesystem it is recommended that you use the ext2 +specific commands, rather than the general commands <tt>setoffset</> and +<tt>settype</>, mainly because: +<enum> +<item> In most cases the general commands will be unreliable, and will + display incorrect information. + + Sometimes in order to edit an object, EXT2ED needs the information + of some other related objects. For example, when editing a + directory, EXT2ED needs access to the inode of the edited directory. + Simply setting the type to a directory <tt>will be unreliable</>, + since the object assumes that you passed through its inode to reach + it, and expects this information, which isn't initialized if you + directly set the type to a directory. +<item> EXT2ED offers far better tools for handling the ext2 filesystem + using the ext2 specific commands. +</enum> + +<sect>ext2 general commands +<p> + +The <tt>ext2 general commands</> are available only when you are editing an +ext2 filesystem. They are <tt>general</> in the sense that they are not +specific to some object, and can be invoked anytime. 
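Whether these commands become available depends on the autodetection step described for <tt>setdevice</>, which looks for the ext2 magic number in the main superblock. A minimal Python sketch of such a check follows. It is not taken from the EXT2ED sources; it relies on the standard ext2 layout, in which the main superblock starts 1024 bytes into the device and the 16-bit little-endian <tt>s_magic</> field (0xEF53) sits 56 bytes into the superblock. The function name `looks_like_ext2` is hypothetical.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the main superblock starts 1024 bytes into the device
MAGIC_OFFSET = 56          # offset of s_magic inside the superblock
EXT2_MAGIC = 0xEF53

def looks_like_ext2(image):
    """Return True if the image carries the ext2 magic number."""
    start = SUPERBLOCK_OFFSET + MAGIC_OFFSET
    if len(image) < start + 2:
        return False
    (magic,) = struct.unpack_from("<H", image, start)
    return magic == EXT2_MAGIC

# Two in-memory stand-ins for a device: one blank, one with the magic set.
blank = bytearray(2048)
fake = bytearray(2048)
struct.pack_into("<H", fake, SUPERBLOCK_OFFSET + MAGIC_OFFSET, EXT2_MAGIC)
```

A damaged main superblock would fail this kind of test, which is the situation the <tt>ForceExt2 on</> configuration option is meant for.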
+ +<sect1><label id="general_superblock">super +<p> +<tscreen><verb> +Syntax: super +</verb></tscreen> +The <tt>super</> command will "bring you" to the main superblock copy. It +will automatically set the object type to <tt>ext2_super_block</>. Then you +will be able to view and edit the superblock. When you are in the +superblock, other commands will be available. + +<sect1>group +<p> +<tscreen><verb> +Syntax: group [number] +</verb></tscreen> +The <tt>group</> command will "bring you" to the main copy of the +<em>number</> group descriptor. It will automatically set the object type to +<tt>ext2_group_desc</>. Then you will be able to view and edit the group +descriptor entry. When you are there, other commands will be available. + +<sect1>cd +<p> +<tscreen><verb> +Syntax: cd path +</verb></tscreen> +The <tt>cd</> command will let you travel in the filesystem in the nice way +that the mounted filesystem would have let you. + +The <tt>cd</> command is a complicated command. Although it may sound +simple at first, an implementation of a typical cd requires passing through +the group descriptors, inodes, directory entries, etc. For example: + +The innocent cd /usr command can be done by using more primitive +EXT2ED commands in the following way (It is implemented exactly this way): +<enum> +<item> Using <tt>group 0</> to go to the first group descriptor. +<item> Using <tt>inode</> to get to the Bad blocks inode. +<item> Using <tt>next</> to pass to the root directory inode. +<item> Using <tt>dir</> to see the directory. +<item> Using <tt>next</> until we find the directory usr. +<item> Using <tt>followinode</> to pass to the inode corresponding to usr. +<item> Using <tt>dir</> to see the directory of /usr. 
+</enum> +And those commands aren't that primitive; For example, the tracing of the +blocks which belong to the root directory is done automatically by the dir +command behind the scenes, and the followinode command will automatically +"run" to the correct group descriptor in order to find the required inode. + +The path to the <tt>general cd</> command needs to be a full pathname - +Starting from <tt>/</>. The <tt>cd</> command stops at the last reachable +point, which can be a directory entry, in which case the type will be set to +<tt>dir</>, or an inode, in which case the type will be set to +<tt>ext2_inode</>. Symbolic links (Only fast symbolic links, meanwhile) are +automatically followed (if they are not across filesystems, of-course). If +the type is set to <tt>dir</>, you can use a path relative to the +"current directory". + +<sect>The superblock +<p> +The superblock can always be reached by the ext2 general command +<tt>super</>. Cross reference section <ref id="general_superblock">. + +The status window will show you which copy of the superblock copies you are +currently editing. + +The main data window will show you the values of the various superblock +variables, along with some interpretation of the values. + +Data can be changed with the <tt>set</> and <tt>writedata</> commands. +<tscreen><verb> +For example, set s_r_blocks_count=1400 will reserve 1400 blocks for root. +</verb></tscreen> + +<sect1>gocopy +<p> +<tscreen><verb> +Syntax: gocopy number +</verb></tscreen> +The <tt>gocopy</> command will "bring you" to the backup copy <em>number</> +of the superblock copies. <tt>gocopy 0</>, for example, will bring you to +the main copy. + +<sect1>setactivecopy +<p> +<tscreen><verb> +Syntax: setactivecopy +</verb></tscreen> +The <tt>setactivecopy</> command will copy the contents of the current +superblock copy onto the contents of the main copy. It will also switch to +editing of the main copy. 
No actual data is written to disk, of-course, +until you issue the <tt>writedata</> command. + +<sect>The group descriptors +<p> +The group descriptors can be edited by the <tt>group</> command. + +The status window will indicate the current group descriptor, the total +number of group descriptors (and hence of group blocks), and the backup copy +number. + +The main data window will just show you the values of the various variables. + +Basically, you can use the <tt>next</> and <tt>prev</> commands, along with the +<tt>set</> command, to modify the group descriptors. + +The group descriptors object is a junction, from which you can reach: +<itemize> +<item> The inode table of the corresponding block group (the <tt>inode</> + command) +<item> The block allocation bitmap (the <tt>blockbitmap</> command) +<item> The inode allocation bitmap (the <tt>inodebitmap</> command) +</itemize> + +<sect1>blockbitmap +<p> +<tscreen><verb> +Syntax: blockbitmap +</verb></tscreen> +The <tt>blockbitmap</> command will let you edit the block bitmap allocation +block of the current group block. + +<sect1>entry +<p> +<tscreen><verb> +Syntax: entry number +</verb></tscreen> +The <tt>entry</> command will move you to the <em>number</> group descriptor in the +group descriptors table. + +<sect1>inode +<p> +<tscreen><verb> +Syntax: inode +</verb></tscreen> +The <tt>inode</> command will pass you to the first inode in the current +group block. + +<sect1>inodebitmap +<p> +<tscreen><verb> +Syntax: inodebitmap +</verb></tscreen> +The <tt>inodebitmap</> command will let you edit the inode bitmap allocation +block of the current group block. + +<sect1>next +<p> +<tscreen><verb> +Syntax: next [number] +</verb></tscreen> +The <tt>next</> command will pass to the next <em>number</> group +descriptor. If <em>number</> is omitted, <em>number=1</> is assumed. 
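For orientation, the position that the <tt>group</>, <tt>entry</>, <tt>next</> and <tt>prev</> commands move between can be worked out by hand. The Python sketch below assumes the original ext2 layout: the group descriptor table starts in the block following the one that holds the main superblock, and each descriptor is 32 bytes long. The helper names are hypothetical, and the values passed in stand for the superblock fields <tt>s_first_data_block</>, <tt>s_blocks_count</> and <tt>s_blocks_per_group</>.

```python
GROUP_DESC_SIZE = 32   # size in bytes of one ext2 group descriptor

def group_desc_offset(group, block_size, first_data_block):
    """Byte offset of the main copy of group descriptor number `group`."""
    table_start = (first_data_block + 1) * block_size
    return table_start + group * GROUP_DESC_SIZE

def group_count(blocks_count, first_data_block, blocks_per_group):
    """Number of block groups, rounding a partial last group up."""
    data_blocks = blocks_count - first_data_block
    return (data_blocks + blocks_per_group - 1) // blocks_per_group

# 1 KiB blocks: the superblock is in block 1, so the table starts in block 2.
third_descriptor = group_desc_offset(2, 1024, 1)

# 4 KiB blocks: the superblock is inside block 0, so the table starts in block 1.
first_descriptor_4k = group_desc_offset(0, 4096, 0)
```

With 1 KiB blocks, <tt>group 2</> would therefore correspond to byte 2112 of the device under these assumptions.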
+ +<sect1>prev +<p> +<tscreen><verb> +Syntax: prev [number] +</verb></tscreen> +The <tt>prev</> command will move back <em>number</> group +descriptors. If <em>number</> is omitted, <em>number=1</> is assumed. + +<sect1>setactivecopy +<p> +<tscreen><verb> +Syntax: setactivecopy +</verb></tscreen> +The <tt>setactivecopy</> command copies the contents of the current group +descriptor to its main copy. The updated main copy will then be shown. No +actual change is made to the disk until you issue the <tt>writedata</> +command. + +<sect>The inode +<p> +An inode can be reached in the following three ways: +<itemize> +<item> Using <tt>inode</> from the corresponding group descriptor. +<item> Using <tt>followinode</> from a directory entry. +<item> Using the <tt>cd</> command with the pathname to the file. + + For example, <tt>cd /usr/src/ext2ed/ext2ed.h</> +</itemize> + +The status window will indicate: +<itemize> +<item> The current global inode number. +<item> The total number of inodes. +<item> On which block group the inode is allocated. +<item> The total number of inodes in this group block. +<item> The index of the current inode in the current group block. +<item> The type of the inode (file, directory, special, etc). +</itemize> + +The main data window, in addition to the list of variables, will contain +some interpretations on the right side. + +If the inode corresponds to a file, you can use the <tt>file</> command to +edit the file. + +If the inode is an inode of a directory, you can use the <tt>dir</> command +to edit the directory. + +<sect1>dir +<p> +<tscreen><verb> +Syntax: dir +</verb></tscreen> +If the inode mode corresponds to a directory (shown on the status window), +you can enter directory mode editing by using <tt>dir</>. + +<sect1>entry +<p> +<tscreen><verb> +Syntax: entry number +</verb></tscreen> +The <tt>entry</> command will move you to the <em>number</> inode in the +current inode table. 
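The relation between the global inode number, the block group and the index within the group, all of which appear in the status window, can be sketched as follows. This Python fragment only assumes the standard ext2 convention that inode numbers start at 1; `locate_inode` is a hypothetical helper and `inodes_per_group` stands for the superblock field <tt>s_inodes_per_group</>.

```python
def locate_inode(inode_number, inodes_per_group):
    """Return (block group, zero-based index in that group's inode table)."""
    if inode_number < 1:
        raise ValueError("ext2 inode numbers start at 1")
    return divmod(inode_number - 1, inodes_per_group)

# With 2048 inodes per group:
bad_blocks_inode = locate_inode(1, 2048)    # first entry of group 0
root_inode = locate_inode(2, 2048)          # second entry of group 0
first_of_group_1 = locate_inode(2049, 2048)
```

This matches the walk described for <tt>cd</>: <tt>inode</> from group 0 lands on the bad blocks inode, and one <tt>next</> reaches the root directory inode.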
+ +<sect1>file +<p> +<tscreen><verb> +Syntax: file +</verb></tscreen> +If the inode mode corresponds to a file (shown on the status window), +you can enter file mode editing by using <tt>file</>. + +<sect1>group +<p> +<tscreen><verb> +Syntax: group +</verb></tscreen> +The <tt>group</> command is used to go to the group descriptor of the +current group block. + +<sect1>next +<p> +<tscreen><verb> +Syntax: next [number] +</verb></tscreen> +The <tt>next</> command will pass to the next <em>number</> inode. +If <em>number</> is omitted, <em>number=1</> is assumed. + +<sect1>prev +<p> +<tscreen><verb> +Syntax: prev [number] +</verb></tscreen> +The <tt>prev</> command will pass to the previous <em>number</> inode. +If <em>number</> is omitted, <em>number=1</> is assumed. + +<sect>The file +<p> +When editing a file, EXT2ED offers you both a continuous and a true +fragmented view of the file - The file is still shown block by block with +the true block number at each stage, and EXT2ED offers you commands which +allow you to move between the <tt>file blocks</>, while finding the +allocated blocks by using the inode information behind the scenes. + +Aside from this, the editing is just <tt>hex editing</> - You move the +cursor in the current block of the file by using <tt>next</> and +<tt>prev</>, move between blocks by <tt>nextblock</> and <tt>prevblock</>, +and make changes by the <tt>set</> command. Note that the set command is +overridden here - There are no variables. The <tt>writedata</> command will +write the current block to the disk. + +Reaching a file can be done by using the <tt>file</> command from its inode. +The <tt>inode</> can be reached by any other means, for example, by the +<tt>cd</> command, if you know the file name. + +The status window will indicate: +<itemize> +<item> The global block number. +<item> The internal file block number. +<item> The file offset. +<item> The file size. +<item> The file inode number. 
+<item> The indirection level - Whether it is a direct block (0), indirect + (1), etc. +</itemize> + +The main data window will display the file either in hex mode or in text +mode, select-able by the <tt>display</> command. + +In hex mode, EXT2ED will display offsets in the current block, along with a +text and hex dump of the current block. + +In either case the <tt>current place</> will be highlighted. In the hex mode +it will be always highlighted, while in the text mode it will be highlighted +if the character is display-able. + +<sect1>block +<p> +<tscreen><verb> +Syntax: block block_num +</verb></tscreen> +The <tt>block</> command is used to move inside the file. The +<em>block_num</> argument is the requested internal file block number. A +value of 0 will reach the beginning of the file. + +<sect1>display +<p> +<tscreen><verb> +Syntax: display [text || hex] +</verb></tscreen> +The <tt>display</> command changes the display mode of the file. <tt>display +hex</> will switch to <tt>hex mode</>, while <tt>display text</> will switch +to text mode. The default mode when no <tt>display</> command is issued is +<tt>hex mode</>. + +<sect1>inode +<p> +<tscreen><verb> +Syntax: inode +</verb></tscreen> +The <tt>inode</> command will return to the inode of the current file. + +<sect1>next +<p> +<tscreen><verb> +Syntax: next [num] +</verb></tscreen> +The <tt>next</> command will pass to the next byte in the file. If +<em>num</> is supplied, it will pass to the next <em>num</> bytes. + +<sect1>nextblock +<p> +<tscreen><verb> +Syntax: nextblock [num] +</verb></tscreen> +The <tt>nextblock</> command will pass to the next block in the file. If +<em>num</> is supplied, it will pass to the next <em>num</> blocks. + +<sect1>prev +<p> +<tscreen><verb> +Syntax: prev [num] +</verb></tscreen> +The <tt>prev</> command will pass to the previous byte in the file. If +<em>num</> is supplied, it will pass to the previous <em>num</> bytes. 
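The indirection level shown in the status window follows from the internal file block number alone. The Python sketch below assumes the classic ext2 inode layout: 12 direct block pointers, followed by one single, one double and one triple indirect pointer, with each indirect block holding block_size / 4 pointers of 32 bits. The function `indirection_level` is a hypothetical helper, not part of EXT2ED.

```python
DIRECT_BLOCKS = 12   # number of direct block pointers in an ext2 inode

def indirection_level(file_block, block_size):
    """Return 0 for a direct block, 1 for indirect, 2 for double, 3 for triple."""
    if file_block < 0:
        raise ValueError("file block numbers start at 0")
    pointers = block_size // 4          # 32-bit block numbers per indirect block
    if file_block < DIRECT_BLOCKS:
        return 0
    file_block -= DIRECT_BLOCKS
    for level in (1, 2, 3):
        span = pointers ** level        # blocks reachable at this level
        if file_block < span:
            return level
        file_block -= span
    raise ValueError("block number beyond the triple indirect range")

# With 1 KiB blocks there are 256 pointers per indirect block.
levels = [indirection_level(n, 1024) for n in (0, 11, 12, 267, 268, 65804)]
```

So with 1 KiB blocks, <tt>block 12</> is the first block reached through the indirect block, and <tt>block 268</> is the first one that needs the double indirect block.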
+ +<sect1>prevblock +<p> +<tscreen><verb> +Syntax: prevblock [num] +</verb></tscreen> +The <tt>prevblock</> command will pass to the previous block in the file. If +<em>num</> is supplied, it will pass to the previous <em>num</> blocks. + +<sect1>offset +<p> +<tscreen><verb> +Syntax: offset file_offset +</verb></tscreen> +The <tt>offset</> command will move to the specified offset in the file. + +<sect1>set +<p> +<tscreen><verb> +Syntax: set [text || hex] arg1 [arg2 arg3 ...] +</verb></tscreen> +The <tt>file set</> command works like the <tt>general set command</>, +with <tt>type=none</>. There are no variables. + +<sect1>writedata +<p> +<tscreen><verb> +Syntax: writedata +</verb></tscreen> +The <tt>writedata</> command will write the current file block to the disk. + +<sect>The directory +<p> +When editing a directory, EXT2ED analyzes for you both the allocation blocks of +the directory entries, and the directory entries. + +Each directory entry is displayed on one row. You can move the highlighted +entry with the usual <tt>next</> and <tt>prev</> commands, and "dive in" +with the <tt>followinode</> command. + +The status window will indicate: +<itemize> +<item> The directory entry number. +<item> The total number of directory entries in this directory. +<item> The current global block number. +<item> The current offset in the entire directory - When viewing the + directory as a continuous file. +<item> The inode number of the directory itself. +<item> The indirection level - Whether it is a direct block (0), indirect + (1), etc. +</itemize> + +<sect1>cd +<p> +<tscreen><verb> +Syntax: cd [path] +</verb></tscreen> +The <tt>cd</> command is used in the usual meaning, like the global cd +command. +<itemize> +<item> If <em>path</> is not specified, the current directory entry is + followed. +<item> <em>path</> can be relative to the current directory. +<item> <em>path</> can also end up in a file, in which case the file inode + will be reached. 
+<item> Symbolic link (fast only, meanwhile) is automatically followed. +</itemize> + +<sect1>entry +<p> +<tscreen><verb> +Syntax: entry [entry_num] +</verb></tscreen> +The <tt>entry</> command sets <em>entry_num</> as the current directory +entry. + +<sect1>followinode +<p> +<tscreen><verb> +Syntax: followinode +</verb></tscreen> +The <tt>followinode</> command will move you to the inode pointed by the +current directory entry. + +<sect1>inode +<p> +<tscreen><verb> +Syntax: inode +</verb></tscreen> +The <tt>inode</> command will return you to the parent inode of the whole +directory listing. + +<sect1>next +<p> +<tscreen><verb> +Syntax: next [num] +</verb></tscreen> +The <tt>next</> command will pass to the next directory entry. +If <em>num</> is supplied, it will pass to the next <em>num</> entries. + +<sect1>prev +<p> +<tscreen><verb> +Syntax: prev [num] +</verb></tscreen> +The <tt>prev</> command will pass to the previous directory entry. +If <em>num</> is supplied, it will pass to the previous <em>num</> entries. + +<sect1>writedata +<p> +<tscreen><verb> +Syntax: writedata +</verb></tscreen> +The <tt>writedata</> command will write the current directory entry to the +disk. + +<sect><label id="block_bitmap">The block allocation bitmap +<p> +The <tt>block allocation bitmap</> of any block group can be reached from +the corresponding group descriptor. + +You will be offered a bit listing of the entire blocks in the group. The +current block will be highlighted and its number will be displayed in the +status window. + +A value of "1" means that the block is allocated, while a value of "0" +signals that it is free. The value is also interpreted in the status +window. You can use the usual <tt>next/prev</> commands, along with the +<tt>allocate/deallocate</> commands. + +<sect1>allocate +<p> +<tscreen><verb> +Syntax: allocate [num] +</verb></tscreen> +The <tt>allocate</> command allocates <em>num</> blocks, starting from the +highlighted position. 
If <em>num</> is not specified, <em>num=1</> is assumed. +Of course, no actual change is made until you issue a <tt>writedata</> command. + +<sect1>deallocate +<p> +<tscreen><verb> +Syntax: deallocate [num] +</verb></tscreen> +The <tt>deallocate</> command deallocates <em>num</> blocks, starting from the +highlighted position. If <em>num</> is not specified, <em>num=1</> is assumed. +Of course, no actual change is made until you issue a <tt>writedata</> command. + +<sect1>entry +<p> +<tscreen><verb> +Syntax: entry [entry_num] +</verb></tscreen> +The <tt>entry</> command sets the current highlighted block to +<em>entry_num</>. + +<sect1>next +<p> +<tscreen><verb> +Syntax: next [num] +</verb></tscreen> +The <tt>next</> command will pass to the next bit, which corresponds to the +next block. If <em>num</> is supplied, it will pass to the next <em>num</> +bits. + +<sect1>prev +<p> +<tscreen><verb> +Syntax: prev [num] +</verb></tscreen> +The <tt>prev</> command will pass to the previous bit, which corresponds to the +previous block. If <em>num</> is supplied, it will pass to the previous +<em>num</> bits. + +<sect>The inode allocation bitmap +<p> + +The <tt>inode allocation bitmap</> is very similar to the block allocation +bitmap explained above. It is also reached from the corresponding group +descriptor. Please refer to section <ref id="block_bitmap">. + +<sect>Filesystem size limitation +<p> + +While an ext2 filesystem has a size limit of <tt>4 TB</>, EXT2ED currently +<tt>can't</> handle filesystems which are <tt>bigger than 2 GB</>. + +I am sorry for the inconvenience. This will hopefully be fixed in future +releases. + +<sect>Copyright +<p> + +EXT2ED is Copyright (C) 1995 Gadi Oxman. + +EXT2ED is hereby placed under the GPL - GNU General Public License. You are free and +welcome to copy, view and modify the sources. 
My only wish is that my +copyright presented above will be left and that a list of the bug fixes, +added features, etc, will be provided. + +The entire EXT2ED project is based, of-course, on the kernel sources. The +<tt>ext2.descriptors</> distributed with EXT2ED is a slightly modified +version of the main ext2 include file, /usr/include/linux/ext2_fs.h. Follows +the original copyright: + +<tscreen><verb> +/* + * linux/include/linux/ext2_fs.h + * + * Copyright (C) 1992, 1993, 1994, 1995 + * Remy Card (card@masi.ibp.fr) + * Laboratoire MASI - Institut Blaise Pascal + * Universite Pierre et Marie Curie (Paris VI) + * + * from + * + * linux/include/linux/minix_fs.h + * + * Copyright (C) 1991, 1992 Linus Torvalds + */ + +</verb></tscreen> + +<sect>Acknowledgments +<p> + +EXT2ED was constructed as a student project in the software +laboratory of the faculty of electrical-engineering in the +<tt>Technion - Israel's institute of technology</>. + +At first, I would like to thank <tt>Avner Lottem</> and <tt>Doctor Ilana +David</> for their interest and assistance in this project. + +I would also like to thank the following people, who were involved in the +design and implementation of the ext2 filesystem kernel code and support +utilities: +<itemize> +<item> <tt>Remy Card</> + + Who designed, implemented and maintains the ext2 filesystem kernel + code, and some of the ext2 utilities. Remy Card is also the author + of several helpful slides concerning the ext2 filesystem. + Specifically, he is the author of <tt>File Management in the Linux + Kernel</> and of <tt>The Second Extended File System - Current State, + Future Development</>. + +<item> <tt>Wayne Davison</> + + Who designed the ext2 filesystem. +<item> <tt>Stephen Tweedie</> + + Who helped designing the ext2 filesystem kernel code and wrote the + slides <tt>Optimizations in File Systems</>. 
+<item> <tt>Theodore Ts'o</> + + Who is the author of several ext2 utilities and of the ext2 library + <tt>libext2fs</> (which I didn't use, simply because I didn't know + it exists when I started to work on my project). +</itemize> + +Lastly, I would like to thank, of-course, <tt>Linus Torvalds</> and the +<tt>Linux community</> for providing all of us with such a great operating +system. + +Please contact me in a case of bug report, suggestions, or just about +anything concerning EXT2ED. + +Enjoy, + +Gadi Oxman <tgud@tochnapc2.technion.ac.il> + +Haifa, August 95 +</article>
\ No newline at end of file |