Monday, 9 February 2015

Cost Model

A cost model allows us to estimate the cost, in terms of execution time, of different database operations.


We use the following notation:

B denotes the number of data pages when records are packed onto pages with no wasted space.
R denotes the number of records per page.
D denotes the average time to read or write a disk page.
C denotes the average time to process a record (e.g., to compare a field value to a selection constant).
H denotes the time required to apply the hash function to a record.
F denotes the fan-out of a tree index (the average number of children of a non-leaf node), which is typically at least 100.


Typical values today are:
D = 15 milliseconds
C and H = 100 nanoseconds
We therefore expect the cost of I/O to dominate: at these values, a single page I/O takes as long as processing 150,000 records.
I/O is often the dominant component of the cost of database operations, so considering I/O costs gives us a good first approximation to the true costs.
Further, CPU speeds are steadily rising, whereas disk speeds are not increasing at a similar pace.
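
To make the gap concrete, here is a minimal sketch that plugs the typical values above into a full scan of a file. The formula B(D + RC), meaning read each page and then process each of its R records, is an assumption that follows from the definitions. The file size and records-per-page figures are hypothetical, chosen only for illustration.

```python
# Typical values quoted above, expressed in seconds.
D = 15e-3    # average time to read or write one disk page
C = 100e-9   # average time to process one record

# Hypothetical file, chosen only for illustration.
B = 10_000   # number of data pages
R = 100      # records per page


def scan_cost(b, r, d=D, c=C):
    """Return (io_seconds, cpu_seconds) for scanning b pages of r records each."""
    io_seconds = b * d        # one page I/O per data page
    cpu_seconds = b * r * c   # one processing step per record
    return io_seconds, cpu_seconds


io_seconds, cpu_seconds = scan_cost(B, R)
print(f"I/O time: {io_seconds:.1f} s")
print(f"CPU time: {cpu_seconds:.1f} s")
print(f"I/O share of total: {io_seconds / (io_seconds + cpu_seconds):.2%}")
```

With these numbers the scan spends roughly 150 seconds on I/O and 0.1 seconds on record processing, which is why counting page I/Os alone is a reasonable first approximation.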


Bear the following observations in mind:
  • Real systems must consider other aspects of cost, such as CPU costs (and network transmission costs in a distributed database).
  • Even with our decision to focus on I/O costs, an accurate model would be too complex for our purpose of conveying the essential ideas in a simple way.
  • We therefore use a simplistic model in which we just count the number of pages read from or written to disk as a measure of I/O.
  • We ignore the important issue of blocked access in our analysis. Typically, disk systems allow us to read a block of contiguous pages in a single I/O request.
  • The cost of such a request is equal to the time required to seek to the first page in the block and transfer all of its pages, which can be much cheaper than issuing one I/O request per page.
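
The effect of blocked access that the simple model ignores can be sketched as follows. This is an illustration only: the split of the page I/O time into a seek part and a transfer part, the values of `SEEK` and `TRANSFER`, and the block size are all hypothetical assumptions, not figures from the model above.

```python
# Hypothetical split of D = 15 ms into positioning and transfer time (seconds).
SEEK = 14e-3      # assumed time to position the disk head at a page
TRANSFER = 1e-3   # assumed time to transfer one page once positioned


def page_at_a_time_cost(pages, seek=SEEK, transfer=TRANSFER):
    """Every page is a separate I/O request, so each one pays seek + transfer."""
    return pages * (seek + transfer)


def blocked_cost(pages, block_size, seek=SEEK, transfer=TRANSFER):
    """One request per block of contiguous pages: one seek per block,
    then a transfer for every page in it."""
    blocks = -(-pages // block_size)  # ceiling division
    return blocks * seek + pages * transfer


pages = 1000
print(f"Simple model (pages counted): {pages}")
print(f"One request per page: {page_at_a_time_cost(pages):.3f} s")
print(f"Blocks of 32 pages:   {blocked_cost(pages, 32):.3f} s")
```

The simple model charges 1000 page I/Os in both cases, while under these assumed timings the blocked read is roughly ten times faster. That difference is what we give up in exchange for a model that is easy to reason about.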
