[annodex-dev] What is metadata?

Conrad Parker conrad at metadecks.org
Mon Feb 11 03:05:25 PST 2008


The word "metadata" gets used a lot, to the extent that it's nearly
meaningless. In Annodex, we've got metadata in CMML, in Skeleton, and
in vorbiscomments. Sometimes we try to distinguish textual content
(like <desc>, the timed-text descriptions in CMML) from metadata (like
<meta> tags).

Steve Yegge clarifies this pretty well:

| Metadata is any kind of description or model of something else. ...  What
| makes metadata meta-data is that it's not strictly necessary. If I have a dog
| with some pedigree paperwork, and I lose the paperwork, I still have a
| perfectly valid dog.
  -- Steve Yegge, "Portrait of a N00b"
     http://steve-yegge.blogspot.com/2008/02/portrait-of-n00b.html

Of course, necessary and useful are different things: it's useful to
tag the language of an audio track, but it's not necessary for
decoding it. At a higher level, in a player UI or on a server, that
information becomes necessary for choosing one of many language
tracks. So, deciding whether or not something is "metadata" depends on
the location of the observer. In any discussion about data, it's
important to understand the needs of the participants: archivist,
server, player, viewer.
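
As a rough illustration of that observer dependence, here is a minimal
Python sketch. The Track class, the "language" tag key and both
functions are hypothetical names made up for this example; they are not
part of Annodex, Skeleton or any Ogg library, and decode() is only a
stand-in for a real decoder.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    # Hypothetical model of one audio track, for illustration only.
    payload: bytes                             # the encoded audio itself
    tags: dict = field(default_factory=dict)   # e.g. {"language": "en"}


def decode(track):
    # Stand-in for a decoder: it reads only the payload and never the
    # tags, so from here the language tag is "just metadata".
    return track.payload


def choose_track(tracks, wanted_language):
    # Stand-in for a player UI or server: it has nothing to go on
    # except the language tag, so here the tag is necessary data.
    for track in tracks:
        if track.tags.get("language") == wanted_language:
            return track
    return None


tracks = [
    Track(b"english audio", {"language": "en"}),
    Track(b"german audio", {"language": "de"}),
]

# The same tracks with the "paperwork" lost.
stripped = [Track(t.payload) for t in tracks]
```

With the tags stripped, decode() still returns the same bytes for every
track, but choose_track(stripped, "de") finds nothing: the dog is still
a perfectly valid dog, but nobody can pick it out by pedigree.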

The point of having multiple ways to represent metadata is to allow
different machines and people to process it without interfering with
one another. Broadly speaking, Skeleton contains whole-track metadata
for the machine, such as seeking hints and information for language
selection, whereas CMML contains timed metadata for humans, such as
locations and names of people. To the archivist it's all important
data, and the discussion of "what is metadata" comes full circle.

Conrad.

