MU Data Mining & Business Intelligence - May 2015 Exam Question Paper

MU Information Technology (Semester 6)
Data Mining & Business Intelligence
May 2015

Total marks: --
Total time: --

INSTRUCTIONS
(1) Assume appropriate data and state your reasons
(2) Marks are given to the right of every question
(3) Draw neat diagrams wherever necessary

1 (a) Describe the different types of attributes one may come across in a data mining data set with two examples of each type.

5 M

1 (b) Explain the different distance measures that can be used to compute distance between two clusters.

5 M

1 (c) Define "Business Intelligence" and "Support System" with examples.

5 M

1 (d) Define "Outlier". What are the different types of outliers that occur in a dataset?

5 M

2 (a) Consider the following data points: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
i) What is the mean of the data? What is the median?
ii) What is the mode of the data?
iii) What is the mid-range of the data?
iv) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
v) Show a box plot of the data.

10 M
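
For checking hand calculations in 2 (a), the following is a minimal Python sketch, not a model answer. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data (other textbook conventions give slightly different values), and the helper name `quartiles` is introduced here only for illustration. It reports the values a box plot would be drawn from rather than drawing one.

```python
from statistics import mean, median, multimode

data = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention)."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    # When n is odd, the overall median is left out of both halves.
    upper = s[half + 1:] if len(s) % 2 else s[half:]
    return median(lower), median(upper)


q1, q3 = quartiles(data)
iqr = q3 - q1
summary = {
    "mean": mean(data),
    "median": median(data),
    "modes": multimode(data),  # the data may have more than one mode
    "mid_range": (min(data) + max(data)) / 2,
    "q1": q1,
    "q3": q3,
}
# Points beyond 1.5 * IQR from the quartiles are usually plotted individually.
outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
print(summary, outliers)
```

The box plot then spans Q1 to Q3 with a line at the median, whiskers reaching the most extreme values that are not flagged, and any flagged points drawn separately.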

2 (b) Design a BI system for fraud detection. Clearly describe all the steps from data collection to decision making.

10 M

3 (a) Illustrate any one classification technique for the data set below. Show how to classify a new tuple (Homeowner=Yes; Status=Employed; Income=Average).

Id	Homeowner	Status	Income	Defaulted
1	Yes	Employed	High	No
2	No	Business	Average	No
3	No	Employed	Low	No
4	Yes	Business	High	No
5	No	Unemployed	Average	Yes
6	No	Business	Low	No
7	Yes	Unemployed	High	No
8	No	Employed	Average	Yes
9	No	Business	Low	No
10	No	Employed	Average	Yes

10 M
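
The question leaves the choice of technique open. As one possible illustration, the sketch below applies a naive Bayes classifier to the table in 3 (a), using plain frequency estimates with no Laplace smoothing (an assumption; a smoothed version would avoid zero probabilities). The income value in row 4 is taken to be High, and the function name `naive_bayes_scores` is hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Columns: Homeowner, Status, Income, Defaulted (class label)
rows = [
    ("Yes", "Employed", "High", "No"),
    ("No", "Business", "Average", "No"),
    ("No", "Employed", "Low", "No"),
    ("Yes", "Business", "High", "No"),
    ("No", "Unemployed", "Average", "Yes"),
    ("No", "Business", "Low", "No"),
    ("Yes", "Unemployed", "High", "No"),
    ("No", "Employed", "Average", "Yes"),
    ("No", "Business", "Low", "No"),
    ("No", "Employed", "Average", "Yes"),
]


def naive_bayes_scores(rows, query):
    """Return P(class) * product of P(attribute value | class) for each class."""
    class_counts = Counter(r[-1] for r in rows)
    scores = {}
    for label, count in class_counts.items():
        score = Fraction(count, len(rows))  # prior P(class)
        for i, value in enumerate(query):
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            score *= Fraction(matches, count)  # unsmoothed conditional
        scores[label] = score
    return scores


scores = naive_bayes_scores(rows, ("Yes", "Employed", "Average"))
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Because no defaulting customer in the table is a homeowner, the unsmoothed score for Defaulted=Yes is zero, so under these assumptions the new tuple is classified as Defaulted=No.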

3 (b) Why is Data Preprocessing required? Explain the different steps involved in Data Preprocessing.

10 M

4 (a) Use K-means to cluster the following data set into 3 clusters.

Protein	20	21	15	22	20	25	26	20	18	20
Fat	9	9	7	17	8	12	14	9	9	9

10 M
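
A hand solution to 4 (a) can be checked with a short K-means sketch such as the one below. The question does not give initial centroids, so this sketch assumes the first three (Protein, Fat) points are used as seeds; a different choice can lead to different clusters. The function names `sq_dist` and `kmeans` are introduced here for illustration only.

```python
points = [(20, 9), (21, 9), (15, 7), (22, 17), (20, 8),
          (25, 12), (26, 14), (20, 9), (18, 9), (20, 9)]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop changing."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Assumed seeds: the first three data points.
labels, centroids = kmeans(points, points[:3])
print(labels, centroids)
```

With these seeds the point (15, 7) ends up in a cluster of its own, which is worth commenting on in a written answer since it depends on the initial choice.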

4 (b) Describe the different visualization techniques that can be used in Data Mining.

10 M

5 (a) Consider the following transaction database:

TID	Items
01	A,B,C,D
02	A,B,C,D,E,G
03	A,C,G,H,K
04	B,C,D,E,K
05	D,E,F,H,L
06	A,B,C,D,L
07	B,I,E,K,L
08	A,B,D,E,L,K
09	A,E,E,H,L
10	B,C,D,F

Apply the Apriori algorithm with minimum support of 30% and minimum confidence of 70% and find all the association rules in the data set.

10 M
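
The level-wise search in 5 (a) can be checked with a small Apriori sketch like the following. It assumes that TID 08 contains both L and K and that the repeated E in TID 09 counts once; with 10 transactions, 30% support corresponds to a count of at least 3. The function names `apriori` and `rules` are hypothetical.

```python
from itertools import combinations

# One set of single-letter items per transaction (TID 01 to 10).
transactions = [set(t) for t in (
    "ABCD", "ABCDEG", "ACGHK", "BCDEK", "DEFHL",
    "ABCDL", "BIEKL", "ABDELK", "AEEHL", "BCDF")]


def apriori(transactions, min_count):
    """Return {frequent itemset: support count}, found level by level."""
    current = [frozenset([i]) for i in sorted({i for t in transactions for i in t})]
    frequent, k = {}, 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    found = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), r)):
                conf = count / frequent[lhs]
                if conf >= min_conf:
                    found.append((lhs, itemset - lhs, conf))
    return found


frequent = apriori(transactions, min_count=3)
found = rules(frequent, min_conf=0.7)
for lhs, rhs, conf in sorted(found, key=lambda r: -r[2]):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Under these assumptions, C -> B qualifies (confidence 5/6) while A -> B does not (confidence 4/6), which is a quick check against a hand-worked table.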

5 (b) Explain the different methods that can be used to evaluate and compare the accuracy of different classification algorithms.

10 M

6 (a) Explain the DBSCAN clustering algorithm with an example.

10 M

6 (b) Explain multilevel and multidimensional association rules.

10 M
