MU Computer Engineering (Semester 8)
Data Warehouse & Mining
May 2012
Total marks: --
Total time: --
(1) Assume appropriate data and state your reasons
(2) Marks are given to the right of every question
(3) Draw neat diagrams wherever necessary

1 (a) Define a data warehouse. Explain the need for developing a data warehouse, and hence explain its architecture.
10 M
1 (b) Compare OLTP and OLAP systems. Explain the steps in KDD with a suitable block diagram.
10 M

2 (a) What is meant by ETL? Explain the ETL process in detail.
10 M
2 (b) State and explain the various schemas used in data warehousing, with an example of each.
10 M

3 (a) Differentiate between the top-down and bottom-up approaches for building a data warehouse. Explain the advantages and disadvantages of each.
10 M
3 (b) Define an information package diagram. For recording the information requirements for "hotel occupancy", with dimensions such as time, hotel, etc., give the information package diagram. Also draw the star schema and the snowflake schema.
10 M

4 (a) What is meant by metadata? Explain with an example. Explain the different types of metadata stored in a data warehouse.
10 M
4 (b) What is meant by association rule mining? For the example given below, apply the Apriori algorithm. Also:
(1) Determine the frequent k-itemsets obtained.
(2) Justify the strong association rules that have been determined, i.e. specify which is the strongest rule obtained.
The table is as follows:

TID Items
01 1,3,4,6
02 2,3,5,7
03 1,2,3,5,8
04 2,5,9,10
05 1,4
Assume minimum support 30% and minimum confidence 75%.
10 M
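
The Apriori computation asked for in 4 (b) can be checked with a short script. This is a minimal sketch, not a model answer: it assumes the standard level-wise Apriori procedure (join, prune, count), and the names `TRANSACTIONS`, `count`, `apriori` and `strong_rules` are illustrative choices, not part of the question.

```python
from itertools import combinations

# Transactions copied from the table in question 4 (b).
TRANSACTIONS = {
    "01": {1, 3, 4, 6},
    "02": {2, 3, 5, 7},
    "03": {1, 2, 3, 5, 8},
    "04": {2, 5, 9, 10},
    "05": {1, 4},
}
MIN_SUPPORT = 0.30     # minimum support given in the question
MIN_CONFIDENCE = 0.75  # minimum confidence given in the question


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for items in TRANSACTIONS.values() if itemset <= items)


def is_frequent(itemset):
    return count(itemset) / len(TRANSACTIONS) >= MIN_SUPPORT


def apriori():
    """Return {frozenset: support count} for all frequent itemsets."""
    all_items = sorted(set().union(*TRANSACTIONS.values()))
    level = [frozenset([i]) for i in all_items if is_frequent(frozenset([i]))]
    frequent = {}
    k = 1
    while level:
        frequent.update({s: count(s) for s in level})
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        level = [c for c in candidates if is_frequent(c)]
    return frequent


def strong_rules(frequent):
    """Return (antecedent, consequent, confidence) for rules meeting MIN_CONFIDENCE."""
    rules = []
    for itemset in frequent:
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = frequent[itemset] / frequent[lhs]
                if confidence >= MIN_CONFIDENCE:
                    rules.append((sorted(lhs), sorted(itemset - lhs), confidence))
    return rules


frequent = apriori()
rules = strong_rules(frequent)

for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"support = {n}/{len(TRANSACTIONS)}")
for lhs, rhs, confidence in rules:
    print(lhs, "->", rhs, f"confidence = {confidence:.0%}")
```

With 5 transactions, a 30% minimum support means an itemset must appear in at least 2 transactions. Under that reading the script finds {1}, {2}, {3}, {4}, {5} as frequent 1-itemsets, {1,3}, {1,4}, {2,3}, {2,5}, {3,5} as frequent 2-itemsets, and {2,3,5} as the only frequent 3-itemset. Five rules reach the 75% confidence threshold, all at 100%: 2 -> 5, 5 -> 2, 4 -> 1, {2,3} -> 5 and {3,5} -> 2. Which one counts as "strongest" depends on the tie-breaker; if support is used, 2 -> 5 and 5 -> 2 lead with 60%.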

5 (a) Explain dimensional modelling in detail.
10 M
5 (b) Explain what is meant by clustering. State and explain its various types with suitable examples.
10 M

6 (a) What is meant by classification? Justify why classification is said to be supervised learning. How is classifier accuracy determined? Also explain its various types.
10 M
6 (b) What is meant by market basket analysis? Explain with an example. State and explain, with formulas, the meaning of the terms:
(i) Support
(ii) Confidence
(iii) Iceberg queries
Hence explain how to mine multilevel association rules from transactional databases, with an example of each.
10 M

Write short notes on (any two)
7 (a) OLAP operations
10 M
7 (b) Data Warehouse Deployment and maintenance
10 M
7 (c) Attribute oriented induction
10 M
7 (d) Web Mining
10 M
