RECALL/PRECISION DATABASE SEARCHING CONTEST


This contest challenges you to achieve both high recall and high precision in the following searches on the ONTAP ERIC database. ONTAP ERIC, file 201, is designed for online training and practice and has the unique feature of answer sets for each search. You can therefore compare your retrieval set (the set that you believe contains all the relevant records) with the answer set (the set that librarians and subject specialists determined to contain all the records that are really appropriate).

ONTAP ERIC (file 201)
Confer with the Chronolog of October 1980:
The Questions: The questions represent real queries asked at reference desks in several libraries. A group of librarians was given the questions with instructions to search each question aiming for the highest possible recall. Offline prints of each librarian's search were obtained, and each record was examined for relevance.
The Answer Sets: The relevant records for each question were designated as the answer set. This means that the answer set is composed of records retrieved by several search strategies for the same question.


RULES OF THE CONTEST:


THE SEARCHES

Search Number   Search Topic

s03             4-H clubs, their members and their activities
s04             Revision of the Anglo-American cataloging rules
s05             Navaho language textbooks or grammars (material written in the Navaho language, or useful for teaching the Navaho language or about Navaho linguistics)
s06             Education in Sri Lanka (including library activities)
s08             16 personality factor test
s09             parapsychology
m01             Direct charging to users for reference and current awareness service of libraries and other information service agencies (philosophy, policy, practice, fees, charges; for any type of library, for any type of reference service; not interested in free services)
m05             Library service to the physically handicapped (not mentally or language-handicapped)
m06             Effects of TV violence on children
m08             Use of school busing to achieve racial integration
m09             Recreational use of forest lands
d02             Audiovisual aids for orientation or instruction of library users.
d03             Evaluation of primary school (grades K-3) English reading programs or reading materials and techniques (but not the evaluation of specific reading tests or instructors, and not the student test scores when they are not being used as part of an evaluation of the reading program and not just the criteria or standards for evaluation). Limit the output to publications available from ERIC/EDRS.
d05             Vocational education of the American Indian (history, data and programs to provide this education; but not training materials to be used in these education programs).
d06             Evaluation of bilingual elementary (grades K-8) and secondary (grades 9-12) school programs and techniques, specifically those that involve both Spanish and English languages.


How to figure your recall and precision:


After settling on what you will call your retrieval set (RS), ask for the appropriate answer set (AS) by issuing the command:
? s an=s02
[note: in this example we are asking for the answer set for question 02. Tailor this command for the particular question you are investigating.]
Combine your RS and the AS to produce the CS, the set of records found in both:
? s s12 and s13
Now quickly note the record counts of your RS, AS and CS sets. Below are directions for isolating your trash and misses depending on the relative sizes of RS, AS and CS sets.
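
The relationship among the three sets can also be sketched offline. The short Python example below is only an illustration, not part of the online system: it assumes each set is held as a collection of record identifiers, and the identifiers shown (ED001 and so on) and the function name trash_and_misses are hypothetical. Trash is whatever is in the RS but not in the CS, and misses are whatever is in the AS but not in the CS.

```python
def trash_and_misses(retrieval_set, answer_set):
    """Return (CS, trash, misses) for a retrieval set and an answer set."""
    rs = set(retrieval_set)
    answers = set(answer_set)
    cs = rs & answers        # CS: records found in both RS and AS
    trash = rs - cs          # retrieved, but not in the answer set
    misses = answers - cs    # in the answer set, but not retrieved
    return cs, trash, misses


# Hypothetical record identifiers, for illustration only.
rs = {"ED001", "ED002", "ED003", "ED004"}
answers = {"ED002", "ED003", "ED005"}

cs, trash, misses = trash_and_misses(rs, answers)
print("CS size:", len(cs))
print("Trash:", sorted(trash))
print("Misses:", sorted(misses))
```

Whatever the relative sizes of RS and AS, the same two subtractions apply; the cases below simply tell you in advance which of the two will come up empty.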

ISOLATING TRASH AND MISSES:


1. Your retrieval set is larger than the answer set:
eg: RS = 100 and AS = 50
When you combine these two together, one of two things can happen...

ONE: CS = 50
You captured the answer set but in doing so also brought along some trash. To view the trash, use


TWO: CS < 50
Unfortunately you missed some of the answer set, and you also brought along some trash. To view your misses, use


And to view your trash, use


2. Your retrieval set is smaller than the answer set:
eg: RS = 25 and AS = 50
When you combine these two together, one of two things can happen...

ONE: CS = 25
Everything you retrieved was relevant, but you missed some things in the answer set. To view your misses, use


TWO: CS < 25
You missed some of the answer set, and you also captured some trash. To view your trash, use


To view your misses, use


3. Your retrieval and answer sets are the same size!!!
eg: RS = 50 and AS = 50
When you combine them together, one of two things can happen...

ONE: CS = 50
Congratulations! You did a perfect search.

TWO: CS < 50
Hold the celebration. You have both trash and misses. To view the trash, use


To view misses, use

 

HOW TO CALCULATE YOUR RECALL AND PRECISION


After you log off, and with printout in hand, you can calculate your recall and precision:

Recall: the percent of the relevant records existing in the database that your search retrieved:
(CS/AS) times 100 = percent RECALL

Precision: the percent of your retrieval set that is really relevant:
(CS/RS) times 100 = percent PRECISION
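
As a worked example of the two formulas, here is a minimal Python sketch. The function name recall_precision and the sample counts (RS = 100, AS = 50, CS = 40) are made up for illustration; substitute the record counts from your own printout.

```python
def recall_precision(rs_count, as_count, cs_count):
    """Return (recall, precision) as percentages from the three record counts."""
    if rs_count <= 0 or as_count <= 0:
        raise ValueError("RS and AS must each contain at least one record")
    if cs_count < 0 or cs_count > min(rs_count, as_count):
        raise ValueError("CS cannot be negative or larger than RS or AS")
    recall = 100 * cs_count / as_count      # (CS/AS) times 100
    precision = 100 * cs_count / rs_count   # (CS/RS) times 100
    return recall, precision


# Hypothetical counts: 100 records retrieved, 50 in the answer set, 40 in common.
recall, precision = recall_precision(100, 50, 40)
print(f"Recall: {recall:.0f}%  Precision: {precision:.0f}%")
```

With these counts the search found 40 of the 50 relevant records (80 percent recall), but only 40 of the 100 records retrieved were relevant (40 percent precision).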