中文网站
  Advanced Search
Read the latest Blogs from IT professionals in the field. Read and write community created documents. Need IT help? Ask our staff. Connect with your peers. Check our Tech Shop for posters, books and software tools. Home

7.2 Database Systems: Decomposition

1. The previous example might seem to suggest that we should decompose schema as much as possible. Careless decomposition, however, may lead to another form of bad design.

2. Consider a design where Lending-schema is decomposed into two schemas

Branch-customer-schema = (bname, bcity, assets, cname)
Customer-loan-schema = (cname, loan#, amount)

3. We construct our new relations from lending by:

branch-customer = â…¡bname,bcity,assets,cname(lending)
customer-loan = â…¡cname;loan#,amount(lending)

4. It appears that we can reconstruct the lending relation by performing a natural join on the two new schemas.

bname bcity assets cname loan# amount

SFU

SFU

SFU

Downtown

Downtown

 

Burnaby

Burnaby

Burnaby

Vancouver

Vancouver

 

2M

2M

2M

8M

8M

 

Tom

Tom

Mary

Tom

Tom

 

 

L-10

L-50

L-20

L-10

L-50

 

 

10K

50K

15K

10K

50K

 

Figure 7.3: Join of the decomposed relations.

5. Figure 7.3 shows what we get by computing branch-customer 1 customer-loan.

6. We notice that there are tuples in branch-customer 1 customer-loan that are not in lending.

7. How did this happen?

  • The intersection of the two schemas is cname, so the natural join is made on the basis of equality in the cname.
  • If two lendings are for the same customer, there will be four tuples in the natural join.
  • Two of these tuples will be spurious - they will not appear in the original lending relation, and should not appear in the database.
  • Although we have more tuples in the join, we have less information.
  • Because of this, we call this a lossy or lossy-join decomposition.
  • A decomposition that is not lossy-join is called a lossless-join decomposition.
  • The only way we could make a connection between branch-customer and customer-loan was through cname.

8. When we decomposed Lending-schema into Branch-schema and Loan-info-schema, we will not have a similar problem. Why not?

Branch-schema = (bname, bcity, assets)
Branch-loan-schema = (bname, cname, loan#, amount)

  • The only way we could represent a relationship between tuples in the two relations is through bname.
  • This will not cause problems.
  • For a given branch name, there is exactly one assets value and branch city.

9. For a given branch name, there is exactly one assets value and exactly one bcity, whereas a similar statement associated with a loan depends on the customer, not on the amount of the loan (which is not unique).

10. We'll make a more formal definition of lossless-join:

  • Let R be a relation schema.
  • A set of relation schemas {R1,R2, ...,Rn} is a decomposition of R if
    R = R1 ∪ R2 ∪, , , ∪Rn
  • That is, every attribute in R appears in at least one Ri for 1 i n.
  • Let r be a relation on R, and let
    ri = â…¡Ri(r) for 1 i n
  • That is, {r1, r2,...rn} is the database that results from decomposing R into {R1,R2,...,Rn}.
  • It is always the case that:
    r ∈ r1 ∝ r2 ∝.. ∝ rn
  • To see why this is, consider a tuple t 2 r.
    - When we compute the relations {r1, r2, . . .,rn}, the tuple t gives rise to one tuple ti in each ri.
    - These n tuples combine together to regenerate t when we compute the natural join of the ri.
    - Thus every tuple in r appears in 1ni=1 ri.
  • However, in general,
    r ≠r1 ∝ r2 ∝, , , ∝ rn
  • We saw an example of this inequality in our decomposition of lending into branch-customer and customer-loan.
  • In order to have a lossless-join decomposition, we need to impose some constraints on the set of possible
    relations.
  • Let C represent a set of constraints on the database.
  • A decomposition {R1,R2,...,Rn} of a relation schema R is a lossless-join decomposition for R if,
    for all relations r on schema R that are legal under C:
    r = â…¡R1(r) ∝ . . . ∝ Rn(r)

11. In other words, a lossless-join decomposition is one in which, for any legal relation r, if we decompose r and then "recompose" r, we get what we started with - no more and no less.

Database System Structure: