This Australian CS coursework task is mainly a C++ assignment on distributed databases and data mining.
The assignment
In this assignment you are required to implement the Bond Energy Algorithm (BEA) for vertical
fragmentation. Your code should contain two separate procedures, AA Generator and CA Generator.
AA Generator takes as input all attributes of a relation, a set of queries, and their access
frequencies at different sites, and produces an affinity matrix AA. CA Generator takes an
affinity matrix AA as input and produces a clustered affinity matrix CA. For a description of
BEA and the definitions of AA and CA, please see the lecture slides/textbook.
In this assignment, the Attribute Affinity is measured by the extended Otsuka-Ochiai coefficient
(https://en.wikipedia.org/wiki/Yanosuke_Otsuka) instead of the traditional method described in
the textbook. The following equations show the details of the computation, where q is the number
of attributes, m is the number of sites, and Aik is the number of times Attribute Ai is accessed
by Query qk, summed over all sites. You must round the result of the division up to the nearest
integer. (Using DOUBLE instead of FLOAT during the calculation may help you get the correct
result.)
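The equations themselves are not reproduced in this copy of the handout. The form below is a reconstruction, not the official statement: it is an assumption inferred from the worked example in this document, and it reproduces every entry of the example matrix AA. The notation use(qk, Ai) (1 if query qk references attribute Ai, 0 otherwise) and acc_l(qk) (access frequency of qk at site l) is borrowed from the usual textbook presentation and is also an assumption here. The index k ranges over the queries.

```latex
% Assumed reconstruction, checked only against the example in this handout.
A_{ik} = \operatorname{use}(q_k, A_i) \cdot \sum_{l=1}^{m} \operatorname{acc}_l(q_k)

\operatorname{aff}(A_i, A_j) =
  \left\lceil
    \frac{\sum_{k} A_{ik} \, A_{jk}}
         {\sqrt{\sum_{k} A_{ik} \cdot \sum_{k} A_{jk}}}
  \right\rceil
```

As a check against the example: A1 is accessed only by q1 (45 times in total) and A3 by q1, q2 and q4 (45 + 5 + 5 = 55 times), so aff(A1, A3) = ceil(45 * 45 / sqrt(45 * 55)) = ceil(40.70...) = 41, which matches the example output.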
Example
For AA Generator:
Input
• The relation, called PROJ, has the following attributes Ai:
Label Name
A1 PNO
A2 PNAME
A3 BUDGET
A4 LOC
• Queries (qi):
q1: SELECT BUDGET FROM PROJ WHERE PNO=Value
q2: SELECT PNAME, BUDGET FROM PROJ
q3: SELECT PNAME FROM PROJ WHERE LOC=Value
q4: SELECT SUM(BUDGET) FROM PROJ WHERE LOC=Value
• Access frequency matrix ACC, where Si denotes the i-th site:
     S1   S2   S3
q1   15   20   10
q2    5    0    0
q3   25   25   25
q4    5    0    0
Output
• The attribute affinity matrix AA:
     A1   A2   A3   A4
A1   45    0   41    0
A2    0   71    1   71
A3   41    1   38    1
A4    0   71    1   71
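The AA example above can be reproduced with a short C++ sketch. It is not the reference solution: the function name buildAffinityMatrix, the Matrix alias, and the 0/1 usage matrix use are hypothetical names introduced for illustration, and the formula it applies (the sum over queries of Aik * Ajk, divided by the square root of the product of the two attributes' total access counts, then rounded up) is an assumption inferred from the example rather than taken from the handout's equations. Parsing the SQL queries into the usage matrix is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Hypothetical helper, not part of the assignment text.
// use[k][i] is 1 if query q(k+1) references attribute A(i+1), else 0.
// acc[k][s] is the access frequency of query q(k+1) at site S(s+1).
Matrix buildAffinityMatrix(const std::vector<std::vector<int>>& use,
                           const std::vector<std::vector<int>>& acc) {
    const std::size_t numQueries = use.size();
    const std::size_t numAttrs = numQueries == 0 ? 0 : use[0].size();

    // a[i][k] = A_ik: accesses of attribute i by query k, summed over all sites.
    std::vector<std::vector<double>> a(numAttrs,
                                       std::vector<double>(numQueries, 0.0));
    std::vector<double> total(numAttrs, 0.0);  // total[i] = sum over k of A_ik
    for (std::size_t k = 0; k < numQueries; ++k) {
        double freq = 0.0;
        for (int f : acc[k]) freq += f;
        for (std::size_t i = 0; i < numAttrs; ++i) {
            if (use[k][i] != 0) {
                a[i][k] = freq;
                total[i] += freq;
            }
        }
    }

    Matrix aa(numAttrs, std::vector<long long>(numAttrs, 0));
    for (std::size_t i = 0; i < numAttrs; ++i) {
        for (std::size_t j = 0; j < numAttrs; ++j) {
            double dot = 0.0;
            for (std::size_t k = 0; k < numQueries; ++k) dot += a[i][k] * a[j][k];
            const double denom = std::sqrt(total[i] * total[j]);
            // Assumed formula: ceil(dot / sqrt(total_i * total_j)).
            // An attribute that is never accessed keeps affinity 0.
            if (denom > 0.0) {
                aa[i][j] = static_cast<long long>(std::ceil(dot / denom));
            }
        }
    }
    return aa;
}
```

For the example, the usage rows would be {1,0,1,0} for q1 (PNO, BUDGET), {0,1,1,0} for q2 (PNAME, BUDGET), {0,1,0,1} for q3 (PNAME, LOC) and {0,0,1,1} for q4 (BUDGET, LOC). All arithmetic is done in double, as the handout suggests, and only the final quotient is rounded up.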
For CA Generator:
Input
• The attribute affinity matrix AA:
     A1   A2   A3   A4
A1   45    0   41    0
A2    0   71    1   71
A3   41    1   38    1
A4    0   71    1   71
Output
• The clustered affinity matrix CA:
     A1   A3   A4   A2
A1   45   41    0    0
A3   41   38    1    1
A4    0    1   71   71
A2    0    1   71   71
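The CA example above can be reproduced with the following C++ sketch of the column-placement step. It assumes the textbook-style BEA measures, bond(Ax, Ay) as the sum over z of aff(Az, Ax) * aff(Az, Ay) and cont(Ai, Ak, Aj) = 2 bond(Ai, Ak) + 2 bond(Ak, Aj) - 2 bond(Ai, Aj), with bonds to the outer boundary taken as 0; check these against the lecture slides before relying on them. The names Clustered, bond and clusterAffinityMatrix are hypothetical. Ties are broken in favour of the leftmost position, which is an assumption: in the example, placing A4 between A3 and A2 and placing it after A2 give the same contribution (20166), and the expected output corresponds to the leftmost of the two.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Hypothetical result type: order[p] is the 0-based attribute at position p.
struct Clustered {
    std::vector<int> order;
    Matrix ca;
};

// bond(x, y) = sum over z of aff(z, x) * aff(z, y); -1 marks the outer boundary.
static long long bond(const Matrix& aa, int x, int y) {
    if (x < 0 || y < 0) return 0;
    const auto cx = static_cast<std::size_t>(x);
    const auto cy = static_cast<std::size_t>(y);
    long long sum = 0;
    for (std::size_t z = 0; z < aa.size(); ++z) sum += aa[z][cx] * aa[z][cy];
    return sum;
}

Clustered clusterAffinityMatrix(const Matrix& aa) {
    const int n = static_cast<int>(aa.size());
    std::vector<int> order;
    for (int k = 0; k < n; ++k) {
        if (k < 2) {  // the first two columns are placed as they are
            order.push_back(k);
            continue;
        }
        const int slots = static_cast<int>(order.size());
        int bestPos = 0;
        long long bestCont = 0;
        for (int pos = 0; pos <= slots; ++pos) {
            const int left = pos == 0 ? -1 : order[static_cast<std::size_t>(pos - 1)];
            const int right = pos == slots ? -1 : order[static_cast<std::size_t>(pos)];
            const long long cont = 2 * bond(aa, left, k) + 2 * bond(aa, k, right) -
                                   2 * bond(aa, left, right);
            // Strict '>' keeps the leftmost position on ties (assumed tie-break).
            if (pos == 0 || cont > bestCont) {
                bestCont = cont;
                bestPos = pos;
            }
        }
        order.insert(order.begin() + bestPos, k);
    }

    // Reorder the rows to match the columns.
    const auto size = static_cast<std::size_t>(n);
    Matrix ca(size, std::vector<long long>(size, 0));
    for (std::size_t r = 0; r < size; ++r) {
        for (std::size_t c = 0; c < size; ++c) {
            ca[r][c] = aa[static_cast<std::size_t>(order[r])]
                         [static_cast<std::size_t>(order[c])];
        }
    }
    return Clustered{order, ca};
}
```

Called on the example AA, the sketch yields the order A1, A3, A4, A2 (indices 0, 2, 3, 1) and the CA matrix shown above. A different tie-breaking rule would produce the order A1, A3, A2, A4 instead, so it is worth confirming which rule the marking scheme expects.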