MU Computer Engineering (Semester 8)
Data Warehouse & Mining
May 2013
INSTRUCTIONS
(1) Assume appropriate data and state your reasons
(2) Marks are given to the right of every question
(3) Draw neat diagrams wherever necessary

1 (a) State the differences between a data warehouse and a data mart.
5 M
1 (b) For a supermarket chain, consider the following dimensions: product, store, time and promotion. The schema contains a central fact table, sales facts, with measures unit_sales, dollars_sales and dollar_cost. Design a star schema for this supermarket example.
5 M
1 (c) Calculate the maximum number of base fact table records for a warehouse with the values given below:
- Time period: 5 years
- Store: 300 stores reporting daily sales
- Product: 40,000 products in each store (about 4000 sell in each store daily)
5 M
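
The arithmetic behind 1 (c) can be sketched as below. This is only an illustration under stated assumptions, not a model answer: it assumes 365 reporting days per year (leap days ignored) and one base fact row per (day, store, product) combination. Both bounds are shown, since the question gives the full product range as well as the number of products that actually sell each day; the variable names are hypothetical.

```python
# Assumption: 365 reporting days per year, leap days ignored.
DAYS_PER_YEAR = 365

years = 5
stores = 300
products_stocked = 40_000      # products carried by each store
products_sold_daily = 4_000    # products that actually sell per store per day

days = years * DAYS_PER_YEAR

# Upper bound if every stocked product sold in every store on every day.
rows_all_products = days * stores * products_stocked

# Bound if only the products that sell each day generate a fact row.
rows_daily_sellers = days * stores * products_sold_daily

print(f"days in period:            {days:,}")
print(f"rows, all products:        {rows_all_products:,}")
print(f"rows, daily sellers only:  {rows_daily_sellers:,}")
```

Which of the two figures counts as the "maximum" depends on whether days with no sale of a product are stored as rows, so the assumption should be stated in the answer.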
1 (d) Illustrate how the supermarket can use clustering methods to improve sales.
5 M

Define the following terms by giving examples:-
2 (a) Factless fact tables
5 M
2 (b) Snowflake schema
5 M
2 (c) Web Structure Mining
5 M
2 (d) Concept Hierarchy
5 M

3 (a) Apply agglomerative hierarchical clustering and draw the single-link and average-link dendrograms for the following distance matrix.

      A    B    C    D    E
 A    0    2    6   10    9
 B    2    0    3    9    8
 C    6    3    0    7    5
 D   10    9    7    0    4
 E    9    8    5    4    0
10 M
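
As a way of checking a hand-drawn dendrogram for 3 (a), the merge order and merge heights can be computed with a small script. The sketch below is a plain-Python illustration, not a reference solution: the function name `agglomerate` and the tuple-based cluster representation are hypothetical choices, and the average link is taken as the unweighted mean of all pairwise distances between two clusters.

```python
from itertools import combinations

LABELS = ["A", "B", "C", "D", "E"]
DIST = [
    [0, 2, 6, 10, 9],
    [2, 0, 3, 9, 8],
    [6, 3, 0, 7, 5],
    [10, 9, 7, 0, 4],
    [9, 8, 5, 4, 0],
]


def mean(values):
    return sum(values) / len(values)


def agglomerate(linkage):
    """Merge the two closest clusters until one remains.

    `linkage` reduces the list of pairwise point distances between two
    clusters to one number (min for single link, mean for average link).
    Returns a list of (left_cluster, right_cluster, merge_height).
    """
    clusters = [(i,) for i in range(len(LABELS))]
    merges = []

    def cluster_distance(pair):
        left, right = pair
        return linkage([DIST[i][j] for i in left for j in right])

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=cluster_distance)
        height = cluster_distance((left, right))
        merges.append((
            "".join(LABELS[i] for i in left),
            "".join(LABELS[i] for i in right),
            height,
        ))
        merged = tuple(sorted(left + right))
        clusters = [c for c in clusters if c not in (left, right)] + [merged]
    return merges


single_link = agglomerate(min)
average_link = agglomerate(mean)

for name, merges in (("single link", single_link), ("average link", average_link)):
    print(name)
    for left, right, height in merges:
        print(f"  merge {left} + {right} at height {height}")
```

The printed heights give the levels at which the dendrogram branches join; the two linkages agree on the first merge (A with B) but differ in the later order and heights.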
3 (b) Explain the PageRank technique with its algorithm.
10 M

4 (a) Consider a data warehouse for a hospital, where there are three dimensions (1) Doctor (2) Patient (3) Time
and two measures (1) Count & (2) Fees
For this example, create an OLAP cube and describe the following OLAP operations:
(1) Slice (2) Dice (3) Rollup (4) Drill Down (5) Pivot
10 M
4 (b) Consider the following transaction database:
 TID   Items
 01    A, B, C, D
 02    A, B, C, D, E, G
 03    A, C, G, H, K
 04    B, C, D, E, K
 05    D, E, F, H, L
 06    A, B, C, D, L
 07    B, I, E, K, L
 08    A, B, D, E, K
 09    A, E, F, H, L
 10    B, C, D, F

Apply the Apriori algorithm with minimum support of 30% and minimum confidence of 70%, and find all the association rules in the data set.
10 M
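
The level-wise search asked for in 4 (b) can be sketched as follows. This is a minimal illustration rather than an optimized or authoritative implementation: it assumes that 30% support over 10 transactions means a support count of at least 3, and the function names `frequent_itemsets` and `association_rules` are hypothetical.

```python
from itertools import combinations

TRANSACTIONS = [
    set("ABCD"), set("ABCDEG"), set("ACGHK"), set("BCDEK"), set("DEFHL"),
    set("ABCDL"), set("BIEKL"), set("ABDEK"), set("AEFHL"), set("BCDF"),
]
MIN_SUPPORT_COUNT = 3    # assumption: 30% of 10 transactions
MIN_CONFIDENCE = 0.70


def support_count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def frequent_itemsets():
    """Return {frozenset: support count} for all frequent itemsets."""
    items = sorted(set().union(*TRANSACTIONS))
    level = [frozenset([i]) for i in items
             if support_count(frozenset([i])) >= MIN_SUPPORT_COUNT]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support_count(itemset)
        level_set = set(level)
        # Join step: unions of two frequent k-itemsets that have k+1 items.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: every k-subset must be frequent, then count support.
        level = [c for c in candidates
                 if all(frozenset(s) in level_set for s in combinations(c, k))
                 and support_count(c) >= MIN_SUPPORT_COUNT]
        k += 1
    return frequent


def association_rules(frequent):
    """Return (antecedent, consequent, confidence) for rules that pass."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), size)):
                confidence = count / frequent[antecedent]
                if confidence >= MIN_CONFIDENCE:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


frequent = frequent_itemsets()
rules = association_rules(frequent)

for antecedent, consequent, confidence in sorted(
        rules, key=lambda r: (-r[2], sorted(r[0]), sorted(r[1]))):
    print(f"{sorted(antecedent)} -> {sorted(consequent)}  conf={confidence:.2f}")
```

Looking up `frequent[antecedent]` is safe because every subset of a frequent itemset is itself frequent, which is the same property the prune step relies on.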

5 (a) A simple example from the stock market involving only discrete ranges has Profit as a categorical attribute with values {Up, Down}. The training data is:

 AGE   COMPETITION   TYPE       PROFIT
 Old   Yes           Software   Down
 Old   No            Software   Down
 Old   No            Hardware   Down
 Mid   Yes           Software   Down
 Mid   Yes           Hardware   Down
 Mid   No            Hardware   Up
 Mid   No            Software   Up
 New   Yes           Software   Up
 New   No            Hardware   Up
 New   No            Software   Up

Apply decision tree algorithm and show the generated rules.
10 M
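
The question in 5 (a) does not name a specific decision tree algorithm. The sketch below assumes ID3 with entropy-based information gain, which is one common choice; the function names and the nested-dictionary tree representation are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["AGE", "COMPETITION", "TYPE"]
# Each row: (age, competition, type, profit); profit is the class label.
ROWS = [
    ("Old", "Yes", "Software", "Down"),
    ("Old", "No", "Software", "Down"),
    ("Old", "No", "Hardware", "Down"),
    ("Mid", "Yes", "Software", "Down"),
    ("Mid", "Yes", "Hardware", "Down"),
    ("Mid", "No", "Hardware", "Up"),
    ("Mid", "No", "Software", "Up"),
    ("New", "Yes", "Software", "Up"),
    ("New", "No", "Hardware", "Up"),
    ("New", "No", "Software", "Up"),
]


def entropy(rows):
    """Shannon entropy (in bits) of the class labels in `rows`."""
    counts = Counter(r[-1] for r in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def information_gain(rows, index):
    """Entropy reduction obtained by splitting `rows` on one attribute."""
    remainder = 0.0
    for value in {r[index] for r in rows}:
        subset = [r for r in rows if r[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def id3(rows, indices):
    """Build a tree as nested dicts; leaves are class labels."""
    labels = {r[-1] for r in rows}
    if len(labels) == 1:
        return labels.pop()
    if not indices:
        # No attributes left: fall back to the majority class.
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    best = max(indices, key=lambda i: information_gain(rows, i))
    remaining = [i for i in indices if i != best]
    branches = {}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        branches[value] = id3(subset, remaining)
    return {ATTRIBUTES[best]: branches}


tree = id3(ROWS, [0, 1, 2])
print(tree)
```

Each root-to-leaf path of the resulting dictionary reads as one IF-THEN rule, for example "IF AGE = Mid AND COMPETITION = No THEN PROFIT = Up".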
5 (b) What is meant by ETL? Explain the ETL process in detail.
10 M

6 (a) Define multidimensional and multilevel association mining.
10 M
6 (b) Explain the role of metadata in a data warehouse.
10 M

Write detailed notes on:-
7 (a) Data Warehouse Architecture.
10 M
7 (b) K-Means Clustering.
10 M
