MU Computer Engineering (Semester 8)
Data Warehouse & Mining
December 2016
(1) Assume appropriate data and state your reasons
(2) Marks are given to the right of every question
(3) Draw neat diagrams wherever necessary

1(a) Consider the following dimensions for a hypermarket chain: Product, Store, Time and Promotion. With respect to this business scenario, answer the following questions. Clearly state any reasonable assumptions you make. Design a star schema. Can the star schema be converted to a snowflake schema? Justify your answer and draw the snowflake schema for the data warehouse (clearly mention the fact table(s), dimension table(s), their attributes and measures).
10 M
1(b) Define linear, non-linear and multiple regression. Plan a regression model for disease development with respect to changes in weather parameters.
10 M

2(a) What is meant by metadata in the context of a data warehouse? Explain the different types of metadata stored in a data warehouse. Illustrate with a suitable example.
10 M
2(b) Describe the various functionalities of Data mining as a step in the process of Knowledge Discovery.
10 M

3(a) In what way can the ETL cycle be used in a typical data warehouse? Explain with a suitable example.
10 M
3(b) What is clustering? Discuss the agglomerative algorithm with the following data and plot a dendrogram using the single-link approach. The table below gives the distances between the data items.
Item  E  A  C  B  D
E     0  1  2  2  3
A     1  0  2  5  3
C     2  2  0  1  6
B     2  5  1  0  3
D     3  3  6  3  0
10 M

4(a) Discuss how data cubes can be computed efficiently.
10 M
4(b) A database has five transactions. Let min-support = 60% and min-confidence = 80%. Find all frequent itemsets using the Apriori algorithm. T_ID is the transaction ID.
T_ID Items bought
T-1000 M, O, N, K, E, Y
T-1001 D, O, N, K, E, Y
T-1002 M, A, K, E
T-1003 M, U, C, K, Y
T-1004 C, O, O, K, E
10 M

5(a) Differentiate
ii) Data Warehouse vs. Data Mart
10 M
5(b) Why is naive Bayesian classification called "naive"? Briefly outline the major ideas of naive Bayesian classification.
10 M

Write short notes on any four: Q6 (i, ii, iii, iv, v)
6(i) Application of Data Mining to Financial Analysis
5 M
6(ii) Factless Fact Table
5 M
6(iii) Indexing OLAP data
5 M
6(iv) Data Quality
5 M
6(v) Decision Tree based Classification Approach
5 M
