Definition

File Organisation

File organisation refers to the logical structure of the records within a file and the method used to access them. The choice of organisation depends on the nature of the application and the required access speed.

Common Organisations

Unstructured Sequence of Bytes

The simplest logical structure: the file is treated as a stream of bytes, and the operating system imposes no internal record structure on it.

  • Usage: Standard in Unix and modern Windows for most files; interpretation is left to the application.
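
As a small illustration of "interpretation is left to the application", the sketch below reads the same four bytes in three different ways. The byte values and variable names are made up for the example.

```python
import struct

# Four raw bytes; the operating system attaches no meaning to them.
raw = bytes([0x41, 0x42, 0x43, 0x44])

as_text = raw.decode("ascii")          # one application reads them as text
as_int = struct.unpack("<I", raw)[0]   # another reads one little-endian 32-bit integer
as_pairs = struct.unpack("<HH", raw)   # a third reads two little-endian 16-bit integers

print(as_text, hex(as_int), [hex(v) for v in as_pairs])
```

The file contents never change; only the application's chosen layout does.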

Pile

The simplest record-based organisation. Records are stored in the order in which they arrive (chronological order).

  • Structure: Records can have variable lengths and a variable number of fields.
  • Access: With no ordering or index, retrieval requires an exhaustive linear search, so lookup time grows in proportion to the number of records.
  • Usage: Useful for log files or temporary data collection.
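
A minimal in-memory sketch of a pile, assuming records are represented as Python dictionaries. The names `pile`, `append_record` and `find_all` are hypothetical and chosen only for this example.

```python
# Hypothetical pile: records are appended in arrival order,
# and each record may carry a different set of fields.
pile = []

def append_record(record):
    """Add a record at the end; no ordering is maintained."""
    pile.append(record)

def find_all(field, value):
    """Exhaustive scan: every record is inspected on every search."""
    return [r for r in pile if r.get(field) == value]

append_record({"ts": 1, "event": "login", "user": "ana"})
append_record({"ts": 2, "event": "error", "code": 500})
append_record({"ts": 3, "event": "login", "user": "raj"})

print(find_all("event", "login"))
```

Appending is cheap, but every search touches the whole file, which is why the pile suits collection rather than retrieval.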

Sequential File

Records are stored in a fixed format and ordered based on a specific key field.

  • Structure: Fixed-length records with a fixed set of fields in a predetermined order.
  • Access: Efficient for processing the entire file in key order. Searching for a specific record is faster than in a pile (e.g., a binary search on the key needs on the order of log2(n) probes for n records) but still requires multiple disk accesses.
  • Maintenance: Difficult to insert new records; often requires an overflow file and periodic reorganisation.
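
The binary search mentioned above can be sketched over fixed-length records as follows. The record layout (a 4-byte key followed by a 16-byte name) and the function names are assumptions made for illustration, and an in-memory buffer stands in for the file on disk.

```python
import io
import struct

# Hypothetical fixed-length layout: 4-byte unsigned key, 16-byte name.
RECORD = struct.Struct("<I16s")

def build_file(rows):
    """Write (key, name) rows as fixed-length records, sorted by key."""
    buf = io.BytesIO()
    for key, name in sorted(rows):
        buf.write(RECORD.pack(key, name.encode()))
    return buf

def find(f, key):
    """Binary search by key; each probe is one seek plus one read."""
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell() // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD.size)   # fixed length makes the offset computable
        k, name = RECORD.unpack(f.read(RECORD.size))
        if k == key:
            return name.rstrip(b"\0").decode()
        if k < key:
            lo = mid + 1
        else:
            hi = mid
    return None

f = build_file([(30, "carol"), (10, "alice"), (20, "bob")])
print(find(f, 20), find(f, 25))
```

The search relies on both properties of the organisation: fixed-length records make every record's offset computable, and key ordering makes halving the range valid.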

Indexed Sequential File

An enhancement of the sequential file that adds an index to support direct access.

  • Mechanism: Each index entry holds a key and a pointer to the start of the corresponding block in the primary file. A lookup searches the index first and then scans only that block sequentially.
  • Maintenance: Uses an overflow file for new insertions to maintain order without shifting all subsequent records immediately.
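
A rough sketch of the lookup path, assuming a single-level index that stores the first key of each block and an overflow area that is simply scanned. The block size, the list-based "file" and the function names are hypothetical; real implementations typically link overflow records from the main file rather than scanning them.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # hypothetical number of records per block

def build(records):
    """Split sorted (key, value) records into blocks; index each block's first key."""
    records = sorted(records)
    blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]
    index = [block[0][0] for block in blocks]
    return index, blocks

def lookup(index, blocks, overflow, key):
    """Search the index, scan one block, then fall back to the overflow area."""
    i = bisect_right(index, key) - 1   # last block whose first key <= key
    if i >= 0:
        for k, v in blocks[i]:
            if k == key:
                return v
    for k, v in overflow:              # records inserted since the last reorganisation
        if k == key:
            return v
    return None

index, blocks = build([(k, f"r{k}") for k in (5, 10, 15, 20, 25, 30, 35)])
overflow = [(22, "r22")]
print(index, lookup(index, blocks, overflow, 25), lookup(index, blocks, overflow, 22))
```

Because the index holds one entry per block rather than per record, it stays small, while the sorted primary file still supports sequential processing.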

Indexed File

Unlike indexed sequential, an indexed file may have multiple indexes, one for each field that might be used as a search criterion.

  • Structure: The primary file records are not necessarily ordered. The indexes provide the logical ordering.
  • Usage: Common in database systems where high-speed lookup by multiple attributes is required.
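
The idea of one index per searchable field can be sketched as below, assuming an in-memory list as the primary file and dictionaries as indexes. The field names `id` and `city` and the helper names are invented for the example.

```python
from collections import defaultdict

records = []  # primary file: arrival order; list position acts as the record address
indexes = {"id": defaultdict(list), "city": defaultdict(list)}  # one index per search field

def insert(record):
    """Append to the primary file and register the position in every index."""
    pos = len(records)
    records.append(record)
    for field, index in indexes.items():
        index[record[field]].append(pos)

def find_by(field, value):
    """Follow the chosen index straight to the matching records."""
    return [records[pos] for pos in indexes[field].get(value, [])]

insert({"id": 7, "city": "Oslo", "name": "Ana"})
insert({"id": 3, "city": "Lima", "name": "Raj"})
insert({"id": 9, "city": "Oslo", "name": "Mei"})

print(find_by("city", "Oslo"))
print(find_by("id", 3))
```

Each insertion must update every index, which is the price paid for fast lookup on several attributes.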

Hash (Direct) File

Uses a hash function applied to the key field to determine the physical address of the record.

  • Access: Provides very fast direct access (ideally O(1)).
  • Collision Handling: If two keys hash to the same location, an overflow file or chaining is used.
  • Usage: Ideal when direct access is the primary requirement and sequential processing is rare.
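
A minimal sketch of hashed access with chaining, assuming integer keys, a fixed number of buckets and a simple modulo hash function. All names and the bucket count are hypothetical, and a list of lists stands in for the bucket addresses on disk.

```python
NUM_BUCKETS = 8  # hypothetical number of bucket addresses

def bucket_of(key):
    """Hash function: maps an integer key to a bucket address."""
    return key % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def put(key, value):
    """Store a record; colliding keys share the bucket's chain."""
    chain = buckets[bucket_of(key)]
    for i, (k, _) in enumerate(chain):
        if k == key:
            chain[i] = (key, value)  # replace an existing record
            return
    chain.append((key, value))

def get(key):
    """Go directly to one bucket and scan only its chain."""
    for k, v in buckets[bucket_of(key)]:
        if k == key:
            return v
    return None

put(3, "a")
put(11, "b")   # 11 % 8 == 3, so this collides with key 3
print(get(3), get(11), get(19))
```

Lookup cost depends on chain length rather than file size, so access stays close to constant time while collisions are rare. The records have no useful physical order, which is why sequential processing is poorly served.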