CAEE Data Catalog Database

Last Updated: 23 January 2002, Emilio

A data & metadata catalog stored in a relational database would give us tremendous flexibility in delivering dynamic content, searching capabilities, and content browsing. Here is a message from Emilio explaining the goals and make-up of this data catalog database. I am working on it.

Here is a draft data model, or database schema. This is a "relationship view" from Microsoft Access. I am prototyping the database in Microsoft Access because it's easy, but the final database would be a MySQL database. The image below shows the general structure but is too small to read; for the details, see the data model in this PDF file. There are four main data entities, or elements, in this data model. They are described below; the numbers in the list correspond to the red numbers shown on the diagram:

  1. data element file: a file or set of files representing one file or storage format of a particular data element (see below)
  2. data element: a data "unit" that may be made up of multiple files, such as a GIS shape file. It may include multiple sets of files for different formats, such as a GIS theme in two different formats together with their corresponding metadata. These are files that represent the same thing and are basically in the same structure.
  3. data set: a logical collection of data elements; for example, a GIS file and an associated table of information, or a GIS file of rainfall gages and associated tables holding the daily rainfall time series
  4. category: hierarchical thematic categories, such as environment > geology > tectonics, to organize the data sets
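
As a rough illustration of how these four entities could fit together, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for the eventual MySQL database. All table names, column names, and sample rows are hypothetical, and the one-to-many relationships (category to data set to data element to data element file) are an assumption; the actual data model in the PDF may differ.

```python
import sqlite3

# Hypothetical schema: one table per main entity, linked parent-to-child.
SCHEMA = """
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    parent_id   INTEGER REFERENCES category(category_id),  -- NULL for a top-level category
    name        TEXT NOT NULL
);
CREATE TABLE data_set (
    data_set_id INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    name        TEXT NOT NULL
);
CREATE TABLE data_element (
    data_element_id INTEGER PRIMARY KEY,
    data_set_id     INTEGER NOT NULL REFERENCES data_set(data_set_id),
    name            TEXT NOT NULL
);
CREATE TABLE data_element_file (
    data_element_file_id INTEGER PRIMARY KEY,
    data_element_id      INTEGER NOT NULL REFERENCES data_element(data_element_id),
    file_format          TEXT NOT NULL,
    file_path            TEXT NOT NULL
);
"""


def build_catalog():
    """Create the sketch schema and load a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO category VALUES (?, ?, ?)",
        [(1, None, "environment"), (2, 1, "geology"), (3, 2, "tectonics")],
    )
    conn.execute("INSERT INTO data_set VALUES (1, 3, 'faults')")
    conn.execute("INSERT INTO data_element VALUES (1, 1, 'fault lines')")
    # One data element stored in two file formats.
    conn.executemany(
        "INSERT INTO data_element_file VALUES (?, ?, ?, ?)",
        [
            (1, 1, "shapefile", "faults/fault_lines.shp"),
            (2, 1, "e00", "faults/fault_lines.e00"),
        ],
    )
    return conn


def category_path(conn, category_id):
    """Follow parent links upward and return a path like 'a > b > c'."""
    names = []
    while category_id is not None:
        name, category_id = conn.execute(
            "SELECT name, parent_id FROM category WHERE category_id = ?",
            (category_id,),
        ).fetchone()
        names.append(name)
    return " > ".join(reversed(names))


def files_for_data_set(conn, data_set_id):
    """List (element name, file format, file path) for every file in a data set."""
    return conn.execute(
        """
        SELECT e.name, f.file_format, f.file_path
        FROM data_element AS e
        JOIN data_element_file AS f ON f.data_element_id = e.data_element_id
        WHERE e.data_set_id = ?
        ORDER BY e.data_element_id, f.data_element_file_id
        """,
        (data_set_id,),
    ).fetchall()
```

With the sample rows above, category_path(conn, 3) walks up from "tectonics" to the top-level category, and files_for_data_set(conn, 1) returns one row per file format of the "fault lines" element.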

In addition to these data entities, there are several "look up tables" that hold the allowable values for a particular field; for example, data license type, data steward, and file format. Also, tables that end in "_lang" allow us to store names of the last three main entities in different languages, for multilingual querying and display; the table "language" holds a list of the supported languages.
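
To make the "_lang" idea concrete, here is a minimal sketch in Python, again using an in-memory SQLite database in place of MySQL. The table "language" and the "_lang" suffix come from the description above; the specific table category_lang, every column name, the sample rows, and the fall-back-to-English behavior are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical schema: the entity table holds no display name itself;
# names live in a companion "_lang" table, one row per supported language.
SCHEMA = """
CREATE TABLE language (
    language_code TEXT PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    parent_id   INTEGER REFERENCES category(category_id)
);
CREATE TABLE category_lang (
    category_id   INTEGER NOT NULL REFERENCES category(category_id),
    language_code TEXT NOT NULL REFERENCES language(language_code),
    name          TEXT NOT NULL,
    PRIMARY KEY (category_id, language_code)
);
"""


def build_multilingual_catalog():
    """Create the sketch schema and load a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO language VALUES (?, ?)",
        [("en", "English"), ("es", "Spanish")],
    )
    conn.executemany("INSERT INTO category VALUES (?, ?)", [(1, None), (2, 1)])
    conn.executemany(
        "INSERT INTO category_lang VALUES (?, ?, ?)",
        [
            (1, "en", "geology"),
            (1, "es", "geología"),
            (2, "en", "tectonics"),  # no Spanish name stored for this one
        ],
    )
    return conn


def category_name(conn, category_id, language_code, fallback="en"):
    """Return the category name in the requested language.

    Falls back to the `fallback` language when no translation exists,
    and returns None when neither is available.
    """
    row = conn.execute(
        """
        SELECT name
        FROM category_lang
        WHERE category_id = ? AND language_code IN (?, ?)
        ORDER BY language_code = ? DESC  -- requested language sorts first
        LIMIT 1
        """,
        (category_id, language_code, fallback, language_code),
    ).fetchone()
    return row[0] if row else None
```

Keeping names in a separate table means adding a language is a matter of inserting rows rather than altering the schema; the fallback is one possible way to handle missing translations, not something the data model itself prescribes.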