Metadata development and documentation



Metadata goals

  • Simple: Enable non-scientists to grasp the range, scope, and extent of data sources identified as "at risk."
  • Broadly applicable: Cover essential properties for data sources identified as "at risk."
  • Extensible: Support metadata extensions over time.

DARTG Metadata scheme -- ver. 0.8 [draft]

Inventory facets (metadata properties, elements)


  1. Science area
  2. Nature of data
  3. Date or date-span
  4. Location of original
  5. Present location
  6. Expected future
  7. Risk level

--- Additional properties/facets that have been suggested:

  1. Extent (number of records, estimated volume)
  2. Data champion (name, institution/agency, email) <-- would this be the contact for where the data resides now?
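
As a rough illustration of how the draft facets could be recorded, here is a minimal sketch in Python. It assumes every facet is free text, treats the two suggested additions as optional, and uses field names and sample values that are hypothetical (they are not part of the ver. 0.8 draft).

```python
from dataclasses import dataclass, asdict
from typing import List, Optional

# The seven inventory facets from the draft scheme, as assumed field names.
CORE_FACETS = (
    "science_area",
    "nature_of_data",
    "date_or_date_span",
    "location_of_original",
    "present_location",
    "expected_future",
    "risk_level",
)


@dataclass
class InventoryRecord:
    """One inventory entry; every facet is plain text in this sketch."""
    science_area: str
    nature_of_data: str
    date_or_date_span: str
    location_of_original: str
    present_location: str
    expected_future: str
    risk_level: str
    # Suggested additions, kept optional because they are not yet agreed.
    extent: Optional[str] = None
    data_champion: Optional[str] = None

    def missing_core_facets(self) -> List[str]:
        """Return the names of core facets that were left blank."""
        values = asdict(self)
        return [name for name in CORE_FACETS if not values[name].strip()]


# Hypothetical sample entry, for illustration only.
record = InventoryRecord(
    science_area="Meteorology",
    nature_of_data="Handwritten observation logs",
    date_or_date_span="1850-1900",
    location_of_original="Unknown",
    present_location="",
    expected_future="Digitization proposed",
    risk_level="High",
)

print(record.missing_core_facets())
```

A check like `missing_core_facets` is one way to spot incomplete entries in a spreadsheet-style inventory; controlled vocabularies for facets such as risk level could be added once definitions are settled.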

--- some additional notes, documents, etc.:

  • Sample inventory uploaded (see Recent changes). See the headers in this spreadsheet: additional attributes have been added, and the original order (above) has been rearranged.


It would be great if DARTG members would help craft definitions and give examples.--JaneGreenberg 22:16, 25 May 2011 (CEST)

  • Science area: The discipline name or descriptor
  • Nature of data: The type of data
  • Location of original: The location of..

It would be great if folks would consider submitting examples here, listing them and linking somewhere on the wiki, or...--JaneGreenberg 22:24, 25 May 2011 (CEST)

Notes and background information on metadata

Descriptive metadata

  • DCMI Metadata Terms: "...a specification of all metadata terms maintained by the Dublin Core Metadata Initiative, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes." (DCMI)
  • DataCite: "The DataCite Metadata Scheme is a list of core metadata properties chosen for the accurate and consistent identification of data for citation and retrieval purposes, along with recommended use instructions" (p. 3, Version 2.0, January 2011).
  • Dryad Metadata Application Profile, Version 3.0, August 2, 2010: "A DCMI Application Profile comprised of three separate modules (1. the publication -- the article associated with content in Dryad; 2. the data package -- the group of data files associated with a given publication; and 3. the data file -- the deposited bitstream)." (More about Dryad)

Scientific disciplines and scope

  • Universal Decimal Classification (UDC): Scheme synopsis, including top-level classes.
  • Hierarchical Interface to LC Classification (HILCC), Columbia University
  • A note relating to discipline representation: There are a number of high-level bibliographic, government, and vendor schemes (e.g., ProQuest Codes). The problem is that most of these systems will fall short of representing the full scope of disciplines DARTG will encounter.
  • Another note relating to classification/metadata: My sense is that we will want to approach discipline in a faceted manner rather than emphasize hierarchies. Once you move down a hierarchy, it becomes increasingly difficult to represent inter-, multi-, and transdisciplinary topics. A person assigning topics within a hierarchy has to account for every pairing, whereas a faceted scheme is flat and allows one to mix + match more easily. This situation holds regardless of whether the facets are classificatory or descriptive. I have 3 slides here (a beer classification showing the simplicity of facets--I hope!). The example illustrates how one has to do more digging and inventing with hierarchies. We can discuss more as a team.
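
To make the faceted-versus-hierarchical point concrete, here is a small sketch in Python. The facet names and values are invented for illustration (they are not DARTG terms, and they are not taken from the slides). It compares how many terms a flat faceted scheme has to maintain with how many leaf nodes a strict hierarchy would need to cover the same combinations.

```python
from itertools import combinations, product

# Hypothetical facets and values, for illustration only.
FACETS = {
    "discipline": ["biology", "chemistry", "physics"],
    "data_type": ["images", "observations", "samples"],
    "medium": ["film", "magnetic tape", "paper"],
}


def tag(**choices):
    """Build a faceted description; each facet may take several values."""
    for facet, values in choices.items():
        unknown = set(values) - set(FACETS[facet])
        if unknown:
            raise ValueError(f"unknown {facet} value(s): {sorted(unknown)}")
    return {facet: sorted(values) for facet, values in choices.items()}


# Flat faceted scheme: only the individual terms are maintained.
faceted_term_count = sum(len(values) for values in FACETS.values())

# Strict hierarchy: one leaf per full path through every level.
hierarchy_leaf_count = len(list(product(*FACETS.values())))

# Extra nodes a hierarchy would need just to cover two-discipline topics.
discipline_pairs = list(combinations(FACETS["discipline"], 2))

# An interdisciplinary item is simply two values in one facet.
item = tag(discipline=["chemistry", "biology"], medium=["film"])

print(faceted_term_count, hierarchy_leaf_count, len(discipline_pairs))
print(item)
```

With three values in each of three facets, the flat scheme maintains 9 terms while the hierarchy needs 27 leaves, before any pairings of disciplines are added. The gap widens as facets or values grow.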