SPPU Information Technology (Semester 8)
Information Storage & Retrieval
May 2017
Solve any one question from Q.1(a,b) & Q.2(a,b)
1(a) Show how single link clusters may be derived from the dissimilarity coefficient by thresholding it.
5 M
1(b) You are developing a text processing system for use in an automatic retrieval system. Explain the following parts:
Removal of high frequency words.
Suffix stripping.
Detecting equivalent stems.
5 M

2(a) Find the similarity of the following query with D1, D2 and D3 using the vector model.
Query      Keywords
q          ant, dog
Document   Text                          Terms
D1         ant ant bee                   ant bee
D2         dog bee dog hog dog ant dog   ant bee dog hog
D3         cat gnu dog eel fox           cat dog eel fox gnu
6 M
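A minimal sketch of how the similarities in 2(a) could be computed. It assumes raw term-frequency weights and cosine similarity, and it reads the table as D1 = "ant ant bee", D2 = "dog bee dog hog dog ant dog", D3 = "cat gnu dog eel fox". Other weighting schemes (for example tf-idf) would give different numbers. The function name `cosine_similarity` is illustrative only.

```python
import math
from collections import Counter

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Assumed reading of the "Text" column of the table in 2(a).
documents = {
    "D1": "ant ant bee",
    "D2": "dog bee dog hog dog ant dog",
    "D3": "cat gnu dog eel fox",
}
query = Counter("ant dog".split())

scores = {
    name: cosine_similarity(query, Counter(text.split()))
    for name, text in documents.items()
}

# Print documents from most to least similar.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.4f}")
# D2: 0.8111
# D1: 0.6325
# D3: 0.3162
```

Under these assumptions the ranking is D2, D1, D3: D2 matches both query terms, D1 matches only "ant", and D3 matches only a single occurrence of "dog".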
2(b) Write a short note on user-oriented measures to evaluate the performance of the system.
4 M

Solve any one question from Q.3(a,b) & Q.4
3(a) Write a note on "Ontology based information sharing".
5 M
3(b) Explain the concept of hash addressing.
5 M

4(a) Consider a reference collection and its set of example information requests. Let q be an information request and Rq = {d3, d5, d9, d25, d39, d44, d50, d70, d80, d120} be the set of documents relevant to q. A new retrieval algorithm has been designed and, when evaluated for information request q, returns the following ranking of the documents in the answer set:
1. d120
2. d84
3. d50
4. d6
5. d8
6. d9
7. d58
8. d129
9. d143
10. d25
11. d38
12. d48
13. d230
14. d113
15. d3

The documents that are relevant to the query q are underlined. Calculate precision and recall for the documents that are relevant to the query q.
10 M
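The calculation in 4(a) can be sketched as follows, assuming Rq is the set of relevant documents and that precision and recall are reported at each rank where a relevant document appears. The function name `precision_recall_points` is illustrative only.

```python
relevant = {"d3", "d5", "d9", "d25", "d39", "d44", "d50", "d70", "d80", "d120"}
ranking = ["d120", "d84", "d50", "d6", "d8", "d9", "d58", "d129",
           "d143", "d25", "d38", "d48", "d230", "d113", "d3"]

def precision_recall_points(ranking, relevant):
    """Return (rank, doc, precision, recall) at each relevant document."""
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision = hits / rank          # relevant seen so far / documents seen so far
            recall = hits / len(relevant)    # relevant seen so far / all relevant documents
            points.append((rank, doc, precision, recall))
    return points

points = precision_recall_points(ranking, relevant)
for rank, doc, precision, recall in points:
    print(f"rank {rank:2d}  {doc:5s}  precision {precision:.1%}  recall {recall:.0%}")
# rank  1  d120   precision 100.0%  recall 10%
# rank  3  d50    precision 66.7%  recall 20%
# rank  6  d9     precision 50.0%  recall 30%
# rank 10  d25    precision 40.0%  recall 40%
# rank 15  d3     precision 33.3%  recall 50%
```

Only five of the ten relevant documents appear in the top 15, so under this reading recall never exceeds 50% for this ranking.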

Solve any one question from Q.5(a,b) & Q.6(a,b)
5(a) Describe the architecture of distributed IR.
8 M
5(b) What do you understand by multimedia query language? Explain various query predicates.
8 M

6(a) What are the issues in distributed IR computing? Write the techniques used to address these issues.
8 M
6(b) Write a note on MULTOS.
8 M

Solve any one question from Q.7(a,b) & Q.8(a,b)
7(a) Write a short note on web data mining.
8 M
7(b) What is page ranking? Calculate the page rank of the following web pages. Assume damping factor 0.7.
10 M
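A minimal sketch of an iterative PageRank calculation for 7(b). The link graph below (A links to B and C, B links to C, C links to A) is purely hypothetical, since the figure for this question is not reproduced here. The sketch assumes a damping factor d = 0.7 and the normalized formula PR(p) = (1 - d)/N + d * sum(PR(q)/out(q)) over the pages q that link to p. The unnormalized variant PR(p) = (1 - d) + d * sum(...) would give values N times larger.

```python
def pagerank(links, damping=0.7, tol=1e-10, max_iter=200):
    """Iterative PageRank; `links` maps each page to the pages it links to.

    Assumes every page has at least one outgoing link (no dangling pages).
    """
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(max_iter):
        new_ranks = {}
        for p in pages:
            # Each page q that links to p shares its rank equally among its out-links.
            incoming = sum(ranks[q] / len(links[q]) for q in pages if p in links[q])
            new_ranks[p] = (1 - damping) / n + damping * incoming
        converged = max(abs(new_ranks[p] - ranks[p]) for p in pages) < tol
        ranks = new_ranks
        if converged:
            break
    return ranks

# Hypothetical three-page graph, used only for illustration.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for page, rank in sorted(ranks.items()):
    print(f"{page}: {rank:.4f}")
```

For this hypothetical graph the ranks sum to 1 and C ends up slightly ahead of A, because C receives links from both A and B.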

8(a) Explain centralized and distributed architecture of a search engine.
10 M
8(b) What is web crawling? Explain the techniques used by web crawlers to crawl the web.
8 M

Solve any one question from Q.9(a,b) & Q.10(a,b)
9(a) What is content based recommendation?
8 M
9(b) Explain semantic web in detail.
8 M

10(a) Define recommender system. Explain collaborative filtering in brief.
8 M
10(b) Discuss trends and research issues involved in web
8 M

