“Finding and retrieving information is central to
libraries, and searching for specific information in large collections of
text—known as information retrieval—has long been of interest to computer
scientists. Until the
development of the web, browsing received less research effort, despite its
importance. Digital libraries
bring information retrieval and browsing together as the general problem of
information discovery—how to find information.” W.Y. Arms, Digital Libraries, p. 66
The focus of this course is on information retrieval in
both structured (e.g., databases on Dialog) and semi-structured (e.g., web
pages) information environments. We will pursue a two-fold strategy: (1)
how to maximize "relevant" retrieval in the mature, structured
information environment of Dialog, and (2) how to apply similar strategies
to the rapidly maturing information environment of the web.
Specific strategies include ranking, duplicate detection, finder databases
and natural language searching with Dialog's Target and Lexis/Nexis FreeStyle.
There is a Recall/Precision Database Searching Contest that challenges
students to employ clever searching strategies, and an analysis of a Gold
Standard Search that challenges students to match skills with the experts.
Each student will analyze a web tool.
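As a rough illustration of the two measures named in the Recall/Precision contest, here is a minimal Python sketch using the standard definitions: precision is the share of retrieved records that are relevant, and recall is the share of all relevant records that were retrieved. The record IDs and the function name `precision_recall` are hypothetical, and the syllabus does not say how the contest itself is scored.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: iterable of record IDs the search returned
    relevant:  iterable of record IDs judged relevant for the topic
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant records that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four records are relevant; the search returned five.
relevant_records = {"A1", "A2", "A3", "A4"}
search_output = ["A1", "A2", "B7", "B9", "A3"]

precision, recall = precision_recall(search_output, relevant_records)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case three of the five retrieved records are relevant, and three of the four relevant records were found. Broadening a search typically raises recall at the cost of precision, which is the trade-off a searching contest of this kind rewards managing well.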
Multi-media event: Film excerpt from
Saracevic, Mokros and Su: "Nature of Interaction Between Users and
Intermediaries in Online Searching."
Establish (if you haven’t already) a personal website. On this website, please place a
link to your 528 assignments.
The work for this course consists of two graded assignments and two
ungraded homework assignments [Grade weight in brackets]
Homework [This work is not graded, but it must be completed.]
- Comparison of
Dialog's Target and Nexis' FreeStyle
- WWW Assignment
There is no final exam, midterm exam or quizzes.
Pick a Search Tool
A sophisticated information intermediary (read: Librarian)
must possess a familiarity with a large number of web tools. Our class is
going to be enriched by student presentations of web tools. Search Engine Watch is a
good source of information about web tools, as are the listings of search
engines on Yahoo.
Every student should choose one search tool and (1) Present an analysis for
the class (see schedule below) and (2) Write a thorough analysis of the web
tool (this should be an html document).
Elements of a critique of a web tool:
Publicize your choice of web tool and select a
Gold Standard Search
- Dialog Journal Name
- Dialog BLUESHEETS 415
- Dialog Product Code
- Dialog Company Name
- Cooper, W.S. (1971)
"A Definition of Relevance for Information Retrieval"
Information Storage & Retrieval, 7, 19-37.
- Froehlich, T. J.
(1994) "Relevance Reconsidered--Towards an Agenda for the 21st
Century: Introduction to Special Topic Issue on Relevance
Research" JASIS, 45(3), April 1994
- Foskett, D.J.
(1972) "A Note on the Concept of 'Relevance'" Information
Storage & Retrieval, 8, 77-78.
- Kemp, D.A. (1974).
"Relevance, Pertinence and Information System Development"
Information Storage & Retrieval, 10, 37-47
- Salton, G. &
McGill, M. (1983) Introduction to Modern Information Retrieval.
- Swanson, D.R.
(1988) "Historical note: Information Retrieval and the Future of
an Illusion" JASIS, 39, 92-98.
- Harter, S. (1996)
"Variations in Relevance Assessments and the Measurement of
Retrieval Effectiveness" JASIS 47(1):37-49
- Belew, R. K. Finding out about: Search
engine technology from a cognitive perspective.
Target and Free-Style Searching
Interesting reading: "Measuring
Search-Engine Quality and Query Difficulty: Ranking with Target and
Freestyle" by Robert M. Losee and Lee Anne H. Paris. Journal of the
American Society for Information Science, 50(10):882-889, 1999. The
results suggest that slightly better subject-based retrieval performance is
obtained with best-case Boolean searching or the ranking engine used by
Freestyle when compared to the ranking engine used by Target....there is
little difference between the two commercial search engines in terms of
performance....The research discussed here has been based on tests using
the CF dataset...However, fulltext systems containing entire
documents...can be expected to perform somewhat differently, and this study
provides only an approximation of the performance that would be obtained
with retrieving full documents using these particular commercial search
Database Searching Contest
- Most Specific Facet
- Building Block
- Citation Pearl
- Vigil, P.J. (1988)
"Search Strategy" [Chapter 5] from his book Online
Retrieval, Wiley 1988. See his Closed Loop Relevance Clustering
Algorithm, p. 103
- Saracevic, T.,
Kantor, P. (1988). "A Study of Information Seeking and
Retrieving. III. Searchers, Searches, and Overlap." JASIS, 39, 197-215.
- Saracevic, T.,
Mokros, H., Su, L. (1990). "Nature of Interaction Between Users
and Intermediaries in Online Searching: A Qualitative Analysis."
Proceedings of the 53rd Annual Meeting of ASIS, (27) 47-54
- Bates, M.J. (1979).
Information Search Tactics. JASIS, 205-214.
- Fox, E., et al.
(1993) Users, User Interfaces and Objects: Envision, a Digital
Library. JASIS, 44(8), 480-491
- Harter, S. P. (1990).
Search Term Combinations and Retrieval Overlap: A Proposed Methodology
and Case Study. JASIS, 41, 132-146.
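The Building Block strategy listed under the Database Searching Contest is usually described as follows: split the topic into facets, OR together the synonyms within each facet, then AND the facet sets. The Python sketch below illustrates that idea on a toy inverted index. The index contents, the terms, and the function name `building_block` are hypothetical, and this is not Dialog command syntax.

```python
def building_block(index, facets):
    """Combine search terms using the building-block pattern.

    index:  dict mapping a term to the set of record numbers containing it
    facets: list of facets, each a list of synonymous terms
    Terms inside one facet are ORed; the facet sets are then ANDed.
    """
    result = None
    for terms in facets:
        facet_set = set()
        for term in terms:
            facet_set |= index.get(term, set())  # OR within the facet
        # AND this facet with everything combined so far
        result = facet_set if result is None else result & facet_set
    return result if result is not None else set()


# Hypothetical toy index: term -> record numbers.
index = {
    "library": {1, 2, 3},
    "archive": {4},
    "retrieval": {2, 3, 5},
    "searching": {4, 6},
}

# Facet 1: the institution; facet 2: the activity.
facets = [["library", "archive"], ["retrieval", "searching"]]
matches = building_block(index, facets)
print(sorted(matches))
```

A Most Specific Facet search could be sketched with the same data by starting from the facet with the fewest postings and adding further facets only if the resulting set is still too large.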