1(a)
For a supermarket chain, consider the following dimensions: product, store, time and promotion. The schema contains a central fact table for sales.
i. Design star schema for the above application.
ii. Calculate the maximum number of base fact table records for the warehouse, given the following values:
Time period: 5 years
Stores: 300 stores reporting daily sales
Products: 40,000 products in each store (about 4,000 sell in each store daily)
10 M
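A minimal sketch for checking the arithmetic in 1(a)(ii). It assumes 365 days per year and that a sold item is in at most one promotion per store per day; neither assumption is stated in the question, and the constant names are illustrative.

```python
# Assumptions (not stated in the question): 365 days per year, and each sold
# item is in at most one promotion per store per day.
YEARS = 5
DAYS_PER_YEAR = 365
STORES = 300
PRODUCTS_SOLD_PER_STORE_PER_DAY = 4_000
PROMOTIONS_PER_SOLD_ITEM = 1

# One base fact row per (day, store, product sold, promotion) combination.
days = YEARS * DAYS_PER_YEAR
max_fact_rows = (
    days * STORES * PRODUCTS_SOLD_PER_STORE_PER_DAY * PROMOTIONS_PER_SOLD_ITEM
)
print(f"{max_fact_rows:,}")  # prints 2,190,000,000
```

If all 40,000 products were assumed to sell in every store every day, the same product would give a bound ten times larger.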
1(b)
Discuss:
i. The steps in KDD process
ii. The architecture of a typical DM system
10 M
2(a)
We would like to view the sales data of a company with respect to three dimensions, namely Location, Item and Time. Represent the sales data as a 3-D data cube, then perform and illustrate the Roll-up, Drill-down, Slice and Dice OLAP operations on it.
10 M
2(b)
An example from the stock market involving only discrete ranges has Profit as the categorical attribute, with values {Up, Down}. The training data set is given below.
Apply decision tree algorithm and show the generated rules.
Age | Competition | Type | Profit |
Old | Yes | Software | Down |
Old | No | Software | Down |
Old | No | Hardware | Down |
Mid | Yes | Software | Down |
Mid | Yes | Hardware | Down |
Mid | No | Hardware | Up |
Mid | No | Software | Up |
New | Yes | Software | Up |
New | No | Hardware | Up |
New | No | Software | Up |
10 M
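A minimal sketch for checking the attribute selection in 2(b). It assumes an ID3-style decision tree that splits on information gain; the question does not name a specific algorithm, and the function names are illustrative.

```python
from collections import Counter
from math import log2

# Training data from the question: (Age, Competition, Type, Profit).
ROWS = [
    ("Old", "Yes", "Software", "Down"),
    ("Old", "No", "Software", "Down"),
    ("Old", "No", "Hardware", "Down"),
    ("Mid", "Yes", "Software", "Down"),
    ("Mid", "Yes", "Hardware", "Down"),
    ("Mid", "No", "Hardware", "Up"),
    ("Mid", "No", "Software", "Up"),
    ("New", "Yes", "Software", "Up"),
    ("New", "No", "Hardware", "Up"),
    ("New", "No", "Software", "Up"),
]
ATTRS = ["Age", "Competition", "Type"]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(rows, col):
    """Entropy reduction obtained by splitting rows on column index col."""
    base = entropy([r[-1] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[col], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Gain of each attribute at the root.
gains = {name: info_gain(ROWS, i) for i, name in enumerate(ATTRS)}

# Gain of the remaining attributes inside the only impure Age branch (Mid).
mid_rows = [r for r in ROWS if r[0] == "Mid"]
mid_gains = {
    name: info_gain(mid_rows, i) for i, name in enumerate(ATTRS) if name != "Age"
}

print(gains)
print(mid_gains)
```

Under these assumptions Age has the largest gain at the root, and Competition separates the classes completely within Age = Mid. That suggests the rules: Age = Old gives Down; Age = New gives Up; Age = Mid with Competition = Yes gives Down; Age = Mid with Competition = No gives Up.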
3(a)
Illustrate the architecture of a typical DW system. Differentiate DW and Data Mart.
10 M
3(b)
Discuss different steps involved in Data Preprocessing.
10 M
4(a)
Discuss various OLAP Models.
10 M
4(b)
Explain the K-Means clustering algorithm. Apply the K-Means algorithm to the following data set with two clusters. Data Set = {1, 2, 6, 7, 8, 10, 15, 17, 20}
10 M
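A minimal sketch for checking a hand-worked answer to 4(b). K-Means results depend on the initial centroids, which the question does not specify; the starting values 1 and 20 below are an assumption, and `kmeans_1d` is an illustrative helper, not a library function.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Plain K-Means on 1-D data; returns (clusters, centroids)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # no change, so the algorithm has converged
            break
        centroids = new_centroids
    return clusters, centroids


DATA = [1, 2, 6, 7, 8, 10, 15, 17, 20]
# Assumed initial centroids: the smallest and largest values in the data set.
clusters, centroids = kmeans_1d(DATA, [1, 20])
print(clusters)
print(centroids)
```

With these starting centroids the clusters settle at {1, 2, 6, 7, 8, 10} and {15, 17, 20}. Starting from 1 and 2 instead reaches the same partition after more iterations.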
5(a)
Describe the steps of ETL process.
10 M
5(b)
Discuss Association Rule Mining and Apriori Algorithm. Apply AR Mining to find all frequent item sets and association rules for the following dataset:
Minimum Support Count = 2
Minimum Confidence = 70%
Transaction_ID | Items |
100 | 1, 2, 5 |
200 | 2, 4 |
300 | 2, 3 |
400 | 1, 2, 4 |
500 | 1, 3 |
600 | 1, 3 |
700 | 1, 3, 2, 5 |
800 | 1, 3 |
900 | 1, 2, 3 |
10 M
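A minimal sketch for checking a hand-worked answer to 5(b), using the transactions and thresholds given in the question. It is a simple level-wise Apriori, not an optimized one, and the function names are illustrative.

```python
from itertools import combinations

TRANSACTIONS = [
    {1, 2, 5}, {2, 4}, {2, 3}, {1, 2, 4}, {1, 3},
    {1, 3}, {1, 3, 2, 5}, {1, 3}, {1, 2, 3},
]


def apriori(transactions, min_count):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def association_rules(frequent, min_conf):
    """Return (lhs, rhs, confidence) for every rule meeting min_conf."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = count / frequent[lhs]  # support(itemset) / support(lhs)
                if conf >= min_conf:
                    rules.append((set(lhs), set(itemset - lhs), conf))
    return rules


frequent = apriori(TRANSACTIONS, min_count=2)
rules = association_rules(frequent, min_conf=0.70)
for lhs, rhs, conf in rules:
    print(f"{sorted(lhs)} -> {sorted(rhs)} (confidence {conf:.2f})")
```

The printed rules can be compared line by line against the rules derived by hand from the frequent 2-item and 3-item sets.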
Write short notes on any four of the following:
6(a)
Updates to Dimension tables
5 M
6(b)
Metrics for Evaluating Classifier Performance
5 M
6(c)
FP tree
5 M
6(d)
Multilevel & Multidimensional Association Rule
5 M
6(e)
Operational vs. Decision Support Systems
5 M