Monday, 9 February 2015

Sorted Files

Sorted Files


Scan:

  • The cost is B(D +RC) because all pages must be examined.
  • Note  that  this  case  is  no  better  or  worse  than  the  case  of  unordered  files.
  • However, the order in which records are retrieved corresponds to the sort order,that is, all records in age order, and for a given age, by sal order.


Search with Equality Selection:
  • We assume that the equality selection matches the sort order (age, sal).
  • In other words, we assume that a selection condition is specified on at least the first field in the composite key (e.g., age =  30).
  • If not (e.g., selection sal = 50 or department = "Toy"), the sort order does not help us and the cost is identical to that for a heap file.
Cost (logB) if matches 1 and 2
Else it is same as SCAN

Search with range selection:

Again assuming that the range selection matches the composite key, the first record that satisfies the selection is located as for search with equality.

Subsequently, data pages are sequentially retrieved until a record is found that does not satisfy  the  range  selection;  this  is  similar  to  an  equality  search  with  many  qualifying records.
The cost is the cost of search plus the cost of retrieving the set of records that satisfy the search.

Insert:

To insert a record while preserving the sort order, we must first find the correct position in the file, add the record, and then fetch  and rewrite all  subsequent pages (because all the old records are shifted by one slot, assuming that the file has no empty slots).
On average, we can assume that the inserted record belongs in the middle of the file.
Therefore, we must read the latter half of the file and then write it back after adding the new record.

Delete:

We  must  search  for  the  record,  remove  the  record  from  the  page,  and  write  the modified page back.
We  must  also  read  and  write  all  subsequent pages  because  all  records  that  follow  the deleted record must be moved up to compact the free space.
The cost is the same as for an insert




No comments:

Post a Comment