REMOVE DUPLICATES

Abbreviation:

RD

Command Format:

RD Sn
RD Sn FROM <file number>,<file number>, etc.

REMOVE DUPLICATES is the most frequently used duplicate detection command. The RD command creates a set of unique records in which only one record from each group of duplicate citations is retained. The basic format for the command is RD Sn.

RD entered without a set number defaults to the last set created.

When duplicate records are identified, the record to retain is chosen according to the order in which the files were entered in the BEGIN command. For example, if the command BEGIN 154,72 is entered, records from File 154 (MEDLINE) are given priority over records from File 72 (EMBASE). You can change this order of priority with the SET FILES command.
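The priority rule can be illustrated with a minimal Python sketch. It is not Dialog's implementation: the Record type, the dup_key matching rule (a case-insensitive title comparison that ignores a trailing period), and the function names are assumptions made for illustration only. The two sample records are the EMBASE and MEDLINE citations for the same article that appear in the IDENTIFY DUPLICATES example later in this document.

```python
from typing import NamedTuple


class Record(NamedTuple):
    file_no: int
    accession: str
    title: str


def dup_key(record):
    # Assumption: two records are duplicates when their titles match after
    # lowercasing and dropping a trailing period. The real matching rules
    # are not described in this document.
    return record.title.strip().lower().rstrip(". ")


def remove_duplicates(records, file_order):
    """Keep one record per duplicate group, preferring earlier files in file_order."""
    rank = {file_no: i for i, file_no in enumerate(file_order)}
    best = {}
    for rec in records:
        key = dup_key(rec)
        kept = best.get(key)
        if kept is None or rank[rec.file_no] < rank[kept.file_no]:
            best[key] = rec
    return list(best.values())


TITLE = ("Anticoagulation to prevent stroke in atrial fibrillation and its "
         "implications for managed care")
records = [
    Record(72, "10679857", TITLE),         # EMBASE copy
    Record(154, "09465755", TITLE + "."),  # MEDLINE copy, trailing period
]

# Equivalent of BEGIN 154,72: the MEDLINE record is retained.
print(remove_duplicates(records, file_order=[154, 72]))
# Equivalent of BEGIN 72,154: the EMBASE record is retained.
print(remove_duplicates(records, file_order=[72, 154]))
```

Passing a different file_order plays the role that the BEGIN order or SET FILES plays in the description above.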

The records in the RD set are in accession number order. Since the SORT command only works with publication year (PY) and publication date (PD) in OneSearch, you can use the IDENTIFY DUPLICATES (ID) command to obtain a set that is approximately sorted by title (initial articles are ignored). Simply use the ID command with the set that resulted from the RD command, as shown in the second group of records in the following example.

Duplicates can be removed from a single file, as well as from multiple files. You can also use the RD command with the FROM option to remove duplicates FROM particular files (e.g., RD S3 FROM 6,8).

Records in a set created with the RD command can be used in later search statements.

Note: If you apply the RD command to a search that includes one or more files that do not offer duplicate detection, a system message notifies you of that fact. The system then processes the remaining files that do offer duplicate detection. All records from the unsupported file(s) are retained in the RD set. Enter the HELP DUP command online to display a list of files that do not offer duplicate detection.
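The handling of files without duplicate detection can be sketched in Python as follows. This is an illustration under stated assumptions, not the system's actual logic: file 999 is a hypothetical unsupported file, the EMBASE copy of the sample title is invented for the example, the message wording is invented, and duplicates are matched by a simple case-insensitive title comparison. The order of records in the returned list is not meant to model the accession number order of a real RD set.

```python
def remove_duplicates(records, file_order, unsupported):
    """records: list of (file_no, title) tuples.

    Returns (retained_records, messages). Records from unsupported files
    are never compared or dropped; they pass straight through.
    """
    rank = {file_no: i for i, file_no in enumerate(file_order)}
    # Hypothetical message text; the real system message is not shown here.
    messages = [f"File {n} does not offer duplicate detection"
                for n in file_order if n in unsupported]
    best = {}
    passthrough = []
    for file_no, title in records:
        if file_no in unsupported:
            passthrough.append((file_no, title))
            continue
        # Assumption: titles match after lowercasing and dropping a final period.
        key = title.strip().lower().rstrip(". ")
        if key not in best or rank[file_no] < rank[best[key][0]]:
            best[key] = (file_no, title)
    return list(best.values()) + passthrough, messages


title = "Outcome of unstable angina in patients with diabetes mellitus"
records = [
    (154, title + "."),
    (72, title),    # hypothetical EMBASE copy of the same citation
    (999, title),   # hypothetical unsupported file: always retained
    (999, title),
]
retained, messages = remove_duplicates(records, [154, 72, 999], {999})
print(messages)
print(retained)
```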

To REMOVE DUPLICATES from a search done in OneSearch, using ID to SORT by title:

?show files;display sets 
File 154:MEDLINE(R) 1985-1997/Aug W3 
(c) format only 1997 Dialog Corporation 
File 72:EMBASE 1985-1997/Jun W4 
(c) 1997 Elsevier Science B.V. 
 
Set Items Description 
S1 643 ASPIRIN AND DIABET? 
S2 101 S1/ENG,1996:1997 
 
 Set Items Description 
--- ----- ----------- 
?rd s2 
...examined 50 records (50) 
...examined 50 records (100) 
...completed examining records 
S3 70 RD S2 (unique items) 
?type s3/6/all 
 
 3/6/1 (Item 1 from file: 154) 
09118702 97319150 
In experimental diabetes the decrease in the eye of lens carnitine levels 
is an early important and selective event. 
 
3/6/2 (Item 2 from file: 154) 
09109902 97243938 
Outcome of unstable angina in patients with diabetes mellitus. 
 
3/6/3 (Item 3 from file: 154) 
09100806 97211065 
Progression of distal symmetric polyneuropathy during diabetes mellitus: 
clinical, neurophysiological, haemorheological changes and self-rating 
scales of patients. 
. 
. 
. 
?id s3 
...examined 50 records (50) 
...completed examining records 
S4 70 ID S3 (sorted in duplicate order) 
?type s4/6/1-4 
 
 4/6/1 (Item 1 from file: 154) 
08896349 97077072 
AL0671, a new potassium channel opener, inhibits nonenzymatic glycation 
of protein and LDL oxidation. 
 
4/6/2 (Item 2 from file: 154) 
08889522 97124614 
An analysis of perioperative surgical mortality and morbidity in the 
asymptomatic carotid atherosclerosis study. ACAS Investigators. 
Asymptomatic Carotid Artheriosclerosis Study. 
 
4/6/3 (Item 3 from file: 154) 
08607141 96260029 
Anticoagulation: risks and benefits in atrial fibrillation. 
 
4/6/4 (Item 4 from file: 154) 
08825920 96430980 
Anticoagulation for atrial fibrillation: epidemiology informing a
difficult clinical decision.

IDENTIFY DUPLICATES

Abbreviation:

ID

Command Format:

ID
ID Sn
ID Sn FROM <file number>,<file number>, etc.

The IDENTIFY DUPLICATES command can be used in single or multiple files to create a sorted set of records in which duplicates are grouped together. The ID command allows you to easily identify duplicate citations, while still retaining all of the records retrieved by your search. Unlike REMOVE DUPLICATES (RD), which automatically eliminates duplicate records from a set, the ID command does not remove records from your search results.

ID entered without a set number defaults to the last set created.

The ID command creates a set of records that have been approximately sorted by title. There are occasional variations to strict alphabetical order because duplicate detection takes into consideration alternate spellings, minor variations in titles, and leading articles, such as "the" and "a."
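A rough Python sketch of this kind of title ordering is shown below. It is only an approximation for illustration: the normalization rules in sort_key (lowercasing, treating punctuation as spaces, dropping one leading article) are assumptions, the article "an" is added to the two articles named above, and the title "The aspirin question" is a hypothetical record. The other titles come from the examples in this document.

```python
import re

# "the" and "a" are named in the text; "an" is an assumption.
ARTICLES = ("the", "a", "an")


def sort_key(title):
    """Normalize a title so that minor variations sort next to each other."""
    words = re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split()
    if words and words[0] in ARTICLES:
        words = words[1:]          # ignore a leading article
    return " ".join(words)


def identify_duplicates(titles):
    # Sorting by the normalized key places records with matching keys side
    # by side; nothing is removed from the set.
    return sorted(titles, key=sort_key)


titles = [
    "Outcome of unstable angina in patients with diabetes mellitus.",
    "Anticoagulation to prevent stroke in atrial fibrillation and its "
    "implications for managed care.",
    "The aspirin question",  # hypothetical title with a leading article
    "Anticoagulation to prevent stroke in atrial fibrillation and its "
    "implications for managed care",
]
for t in identify_duplicates(titles):
    print(t)
```

Because the leading article is dropped before comparison, the hypothetical title files under "aspirin" rather than "the", which is one way a listing can depart from strict alphabetical order.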

By displaying the ID set, you can decide which records to TYPE, DISPLAY, or PRINT. The SET FILES command can be used to change the order in which records are sorted in an ID set. You can also use the ID command on a set that has had the duplicates removed; this will sort the set alphabetically by title.

If you typically post-process your search results (e.g., format them into customized bibliographies with word-processing software), you can use the ID command to gather duplicate records, such as various editions of a book, and then combine them later into a single record that contains the best features of each.

You can also use the ID command with the FROM option to group duplicates FROM particular files (e.g., ID S3 FROM 6,8).

Note: If you apply the ID command to a search that includes one or more files that do not offer duplicate detection, a system message will notify you of that fact. Dialog will then process the remaining files that do offer duplicate detection. Records from unsupported files will be retained in the ID set, but will be sorted to the bottom of the set. A list of files not offering duplicate detection can be obtained online by entering HELP DUP.
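The placement of records from unsupported files can be sketched in Python with a two-part sort key. This is a hedged illustration, not the system's algorithm: file 999 and its record title are hypothetical, title_key is an assumed normalization, and the sketch assumes that records from unsupported files keep their original relative order at the bottom of the set.

```python
import re


def title_key(title):
    # Assumed normalization: lowercase, punctuation to spaces, drop a leading article.
    words = re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split()
    if words and words[0] in ("the", "a", "an"):
        words = words[1:]
    return " ".join(words)


def id_order(records, unsupported):
    """records: list of (file_no, title) tuples; all records are retained."""
    def key(record):
        file_no, title = record
        if file_no in unsupported:
            return (1, "")            # bottom of the set, original order kept
        return (0, title_key(title))  # supported files: ordered by title
    return sorted(records, key=key)   # sorted() is stable


records = [
    (999, "Aspirin use in diabetes"),  # hypothetical unsupported file
    (154, "Outcome of unstable angina in patients with diabetes mellitus."),
    (72, "Acute coronary syndromes in the United States and United Kingdom: "
         "A comparison of approaches"),
]
for file_no, title in id_order(records, unsupported={999}):
    print(file_no, title)
```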

To IDENTIFY DUPLICATES while using OneSearch:

?b 72,154
 07jun98 15:56:00 User306002 Session D679.3
 
SYSTEM:OS - DIALOG OneSearch
 File 72:EMBASE 1985-1998/Jun W1
 (c) 1998 Elsevier Science B.V.
 File 154:MEDLINE(R) 1985-1998/Jul W4
 (c) format only 1998 Dialog Corporation
 
Set Items Description
 --- ----- -----------
?select aspirin and diabet?
 25274 ASPIRIN
 167271 DIABET?
 S1 758 ASPIRIN AND DIABET?
?s s1/1998
 758 S1
 186367 PY=1998
 S2 36 S1/1998
?id s2
...completed examining records
 S3 36 ID S2 (sorted in duplicate order)
?type s3/6/1-5
 
 3/6/1 (Item 1 from file: 72)
10717382 EMBASE No: 98143901
 Acute coronary syndromes in the United States and United Kingdom: A
comparison of approaches
 
 
 3/6/2 (Item 2 from file: 72)
10726687 EMBASE No: 98160346
 Acute myocardial infarction in Switzerland: Results from the PIMICS
myocardial infarction registry
 DER AKUTE MYOKARDINFARKT IN DER SCHWEIZ: RESULTATE AUS DEM PIMICS-
HERZINFARKT-REGISTER
 
 
 3/6/3 (Item 3 from file: 72)
10679857 EMBASE No: 98115246
 Anticoagulation to prevent stroke in atrial fibrillation and its
implications for managed care
 
 
 3/6/4 (Item 4 from file: 154)
09465755 98184483
 Anticoagulation to prevent stroke in atrial fibrillation and its
implications for managed care.
 
 
 3/6/5 (Item 5 from file: 154)
09493275 98227654
 Antioxidants diminish developmental damage induced by high glucose and
cyclooxygenase inhibitors in rat embryos in vitro.