MU Data Mining & Business Intelligence - May 2016 Exam Question Paper

MU Information Technology (Semester 6)
Data Mining & Business Intelligence
May 2016

Total marks: --
Total time: --

INSTRUCTIONS
(1) Assume appropriate data and state your reasons
(2) Marks are given to the right of every question
(3) Draw neat diagrams wherever necessary

1(a) Define 'Data Mining'. Enumerate five example applications that can benefit by using Data Mining.

5 M

1(b) What is Data Preprocessing? Explain the different methods for the Data Cleansing phase.

5 M

1(c) What is hierarchical clustering? Explain any two techniques for finding distance between the clusters in hierarchical clustering.

5 M

1(d) Explain the concept of a decision support system with the help of an example application.

5 M

2(a) Partition the given data into 4 bins using Equi-depth binning method and perform smoothing according to the following methods.
Smoothing by bin mean
Smoothing by bin median
Smoothing by bin boundaries.
Data: 11,13,13,15,15,16,19,20,20,20,21,21,22,23,24,30,40,45,45,45,71,72,73,75.

10 M
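
A hand-worked answer to 2(a) can be checked with a short sketch like the one below. It assumes the 24 values split evenly into 4 bins of 6, and that a value equidistant from both bin boundaries goes to the lower one; the question does not fix a tie rule, so that choice is an assumption. The function names are illustrative only.

```python
from statistics import mean, median

data = [11, 13, 13, 15, 15, 16, 19, 20, 20, 20, 21, 21,
        22, 23, 24, 30, 40, 45, 45, 45, 71, 72, 73, 75]

def equi_depth_bins(values, n_bins):
    """Split sorted values into n_bins bins of equal size.
    Assumes len(values) is divisible by n_bins."""
    values = sorted(values)
    size = len(values) // n_bins
    return [values[i * size:(i + 1) * size] for i in range(n_bins)]

def smooth_by_mean(bins):
    # Every value in a bin is replaced by the bin mean (rounded to 2 places).
    return [[round(mean(b), 2)] * len(b) for b in bins]

def smooth_by_median(bins):
    # Every value in a bin is replaced by the bin median.
    return [[median(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    # Each value is replaced by the closer of the bin's min and max.
    # Assumption: ties go to the lower boundary.
    return [[b[0] if v - b[0] <= b[-1] - v else b[-1] for v in b]
            for b in bins]

bins = equi_depth_bins(data, 4)
for name, fn in [("mean", smooth_by_mean),
                 ("median", smooth_by_median),
                 ("boundaries", smooth_by_boundaries)]:
    print(name, fn(bins))
```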

2(b) For the same set of data points as in question 2(a):
(a) Find the Mean, Median and Mode.
(b) Show a boxplot of the data, clearly indicating the five-number summary.

10 M
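
The statistics asked for in 2(b) can be cross-checked with the sketch below. Quartile conventions differ between textbooks and libraries; this sketch assumes the "median of each half" convention, so Q1 and Q3 may differ slightly under another definition. The helper name five_number_summary is illustrative.

```python
from statistics import mean, median, multimode

data = [11, 13, 13, 15, 15, 16, 19, 20, 20, 20, 21, 21,
        22, 23, 24, 30, 40, 45, 45, 45, 71, 72, 73, 75]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).
    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    excluding the middle value when the count is odd."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return (s[0], median(lower), median(s), median(upper), s[-1])

print("mean:", mean(data))
print("median:", median(data))
print("mode(s):", multimode(data))  # the data may have more than one mode
print("five-number summary:", five_number_summary(data))
```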

3(a) The table below shows a sample dataset of whether a customer responds to a survey or not. 'Outcome' is the class label.
Construct a Decision Tree Classifier for the dataset. For a new example (Rural, Semi-Detached, Low, No), what will be the predicted class label?

District	House Type	Income	Previous Customers	Outcome
Suburban	Detached	High	No	Nothing
Suburban	Detached	High	Yes	Nothing
Suburban	Detached	High	No	Responded
Urban	Semi-Detached	High	No	Responded
Urban	Semi-Detached	Low	No	Responded
Urban	Semi-Detached	Low	No	Nothing
Rural	Semi-Detached	Low	Yes	Responded
Suburban	Terrace	High	No	Nothing
Suburban	Semi-Detached	Low	No	Responded
Urban	Terrace	Low	No	Responded
Suburban	Terrace	Low	Yes	Responded
Rural	Terrace	High	Yes	Responded
Rural	Detached	Low	No	Responded
Urban	Terrace	High	Yes	Nothing

10 M
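
The question does not name a split criterion. As one possible approach, the sketch below assumes ID3-style information gain and computes it for each attribute to suggest a root split; other criteria (gain ratio, Gini index) are equally acceptable. The names ATTRS, ROWS, entropy and info_gain are illustrative.

```python
from collections import Counter
from math import log2

ATTRS = ["District", "House Type", "Income", "Previous Customers"]
ROWS = [  # last field is the class label 'Outcome'
    ("Suburban", "Detached", "High", "No", "Nothing"),
    ("Suburban", "Detached", "High", "Yes", "Nothing"),
    ("Suburban", "Detached", "High", "No", "Responded"),
    ("Urban", "Semi-Detached", "High", "No", "Responded"),
    ("Urban", "Semi-Detached", "Low", "No", "Responded"),
    ("Urban", "Semi-Detached", "Low", "No", "Nothing"),
    ("Rural", "Semi-Detached", "Low", "Yes", "Responded"),
    ("Suburban", "Terrace", "High", "No", "Nothing"),
    ("Suburban", "Semi-Detached", "Low", "No", "Responded"),
    ("Urban", "Terrace", "Low", "No", "Responded"),
    ("Suburban", "Terrace", "Low", "Yes", "Responded"),
    ("Rural", "Terrace", "High", "Yes", "Responded"),
    ("Rural", "Detached", "Low", "No", "Responded"),
    ("Urban", "Terrace", "High", "Yes", "Nothing"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, col):
    """Information gain from splitting rows on the attribute at index col."""
    groups = {}
    for r in rows:
        groups.setdefault(r[col], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([r[-1] for r in rows]) - remainder

gains = {name: info_gain(ROWS, i) for i, name in enumerate(ATTRS)}
root = max(gains, key=gains.get)
for name, g in gains.items():
    print(f"{name}: {g:.3f}")
print("suggested root attribute:", root)
```

Under this criterion the same functions can be reapplied to each subset of rows to grow the remaining levels of the tree.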

3(b) Briefly explain Bagging and Boosting of classifiers.

10 M

4(a) Use the Apriori algorithm to identify the frequent itemsets in the following database. Then extract the strong association rules from these sets.
Min. Support = 30%, Min. Confidence = 75%

TID	Items
01	A, B, D, E, F
02	B, C, E
03	A, B, D, E
04	A, B, C, E
05	A, B, C, D, E, F
06	B, C, D
07	A, B, D, E

10 M
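
The level-wise procedure asked for in 4(a) can be sketched as follows. This is a minimal illustration rather than an optimized implementation: it works with support counts, generates candidates by joining frequent k-itemsets, prunes any candidate with an infrequent subset, and then derives rules that meet the confidence threshold. The function names are illustrative.

```python
from itertools import combinations

transactions = [
    {"A", "B", "D", "E", "F"},
    {"B", "C", "E"},
    {"A", "B", "D", "E"},
    {"A", "B", "C", "E"},
    {"A", "B", "C", "D", "E", "F"},
    {"B", "C", "D"},
    {"A", "B", "D", "E"},
]

def count(itemset, transactions):
    """Number of transactions containing every item of itemset."""
    return sum(itemset <= t for t in transactions)

def apriori(transactions, min_support):
    """Return {frequent itemset: support count}."""
    min_count = min_support * len(transactions)
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if count(frozenset([i]), transactions) >= min_count]
    frequent, k = {}, 1
    while level:
        for s in level:
            frequent[s] = count(s, transactions)
        prev = set(level)
        # Join step: unions of frequent k-itemsets that have k + 1 items.
        candidates = {a | b for a in level for b in level
                      if len(a | b) == k + 1}
        # Prune step: every k-subset must be frequent, then check support.
        level = [c for c in candidates
                 if all(frozenset(s) in prev for s in combinations(c, k))
                 and count(c, transactions) >= min_count]
        k += 1
    return frequent

def strong_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, cnt in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = cnt / frequent[lhs]
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, conf))
    return rules

frequent = apriori(transactions, 0.30)
rules = strong_rules(frequent, 0.75)
for lhs, rhs, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), f"{conf:.2f}")
```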

4(b) Explain multidimensional and multilevel association rules with examples.

10 M

5(a) Use any hierarchical clustering algorithm to cluster the following 8 examples into 3 clusters:
A1=(2, 10), A2=(2, 5), A3=(8, 4), A4=(5, 8),
A5=(7, 5), A6=(6, 4), A7=(1, 2), A8=(4, 9)

10 M
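
Since 5(a) allows any hierarchical algorithm, the sketch below assumes agglomerative clustering with single linkage and Euclidean distance; complete or average linkage would be equally valid choices and may give different clusters. The function name single_linkage is illustrative.

```python
from itertools import combinations
from math import dist

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}

def single_linkage(points, n_clusters):
    """Agglomerative clustering: start with one cluster per point and
    repeatedly merge the two clusters whose closest members are nearest."""
    clusters = [{name} for name in points]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(dist(points[a], points[b])
                               for a in clusters[ij[0]]
                               for b in clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]  # merge cluster j into cluster i
        del clusters[j]             # safe because i < j
    return clusters

result = single_linkage(points, 3)
print(sorted(sorted(c) for c in result))
```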

5(b) What is an outlier? Describe methods that can be used for outlier analysis.

10 M

6(a) Consider the following case study: An international chain of hotels wants to analyse and improve its performance using several performance indicators: quality of rooms, service facilities, check-in, breakfast, popular times of visit, duration of stay, etc.
For this case study, design a BI system, clearly explaining all steps from data collection to decision making.

10 M

6(b) Clearly explain the working of the DBSCAN algorithm using appropriate diagrams.

10 M
