1 (a)
Differentiate between a data warehouse and a data mart.
5 M
1 (b)
For a supermarket chain, consider the following dimensions: product, store, time and promotion. The schema contains a central fact table, sales facts, with the measures unit_sales, dollars_sales and dollar_cost. Design a star schema for this supermarket example.
5 M
1 (c)
Calculate the maximum number of base fact table records for a warehouse with the following values:
- Time period: 5 years
- Store: 300 stores reporting daily sales
- Product: 40,000 products in each store (about 4000 sell in each store daily)
5 M
1 (d)
Illustrate how the supermarket can use clustering methods to improve sales.
5 M
Define the following terms by giving examples:-
2 (a)
Factless fact tables
5 M
2 (b)
Snowflake schema
5 M
2 (c)
Web Structure Mining
5 M
2 (d)
Concept Hierarchy
5 M
3 (a)
Apply agglomerative hierarchical clustering and draw the single-link and average-link dendrograms for the following distance matrix.
  | A | B | C | D | E |
A | 0 | 2 | 6 | 10 | 9 |
B | 2 | 0 | 3 | 9 | 8 |
C | 6 | 3 | 0 | 7 | 5 |
D | 10 | 9 | 7 | 0 | 4 |
E | 9 | 8 | 5 | 4 | 0 |
10 M
3 (b)
Explain the PageRank technique along with its algorithm.
10 M
4 (a)
Consider a data warehouse for a hospital, where there are three dimensions: (1) Doctor (2) Patient (3) Time
and two measures: (1) Count and (2) Fees.
For this example, create an OLAP cube and describe the following OLAP operations:
(1) Slice (2) Dice (3) Rollup (4) Drill Down (5) Pivot
10 M
4 (b)
Consider the following transaction database:
TID | Items |
01 | A,B,C,D |
02 | A,B,C,D,E,G |
03 | A,C,G,H,K |
04 | B,C,D,E,K |
05 | D,E,F,H,L |
06 | A,B,C,D,L |
07 | B,I,E,K,L |
08 | A,B,D,E,K |
09 | A,E,F,H,L |
10 | B,C,D,F |
Apply the Apriori algorithm with minimum support of 30% and minimum confidence of 70%, and find all the association rules in the data set.
10 M
5 (a)
A simple example from the stock market involving only discrete ranges has Profit as the categorical attribute, with values {Up, Down}. The training data is:
AGE | COMPETITION | TYPE | PROFIT |
Old | Yes | Software | Down |
Old | No | Software | Down |
Old | No | Hardware | Down |
Mid | Yes | Software | Down |
Mid | Yes | Hardware | Down |
Mid | No | Hardware | Up |
Mid | No | Software | Up |
New | Yes | Software | Up |
New | No | Hardware | Up |
New | No | Software | Up |
Apply the decision tree algorithm and show the generated rules.
10 M
5 (b)
What is meant by ETL? Explain the ETL process in detail.
10 M
6 (a)
Define multidimensional and multilevel association mining.
10 M
6 (b)
Explain the role of metadata in a data warehouse.
10 M
Write detailed notes on:-
7 (a)
Data Warehouse Architecture.
10 M
7 (b)
K-Means Clustering.
10 M