[annodex-dev] captioning and cmml

Ralph Giles giles at xiph.org
Fri Dec 7 17:59:49 PST 2007


On Mon, Nov 19, 2007 at 03:00:33AM +1100, Silvia Pfeiffer wrote:

> I've just added a "caption" element to the CMML draft. Check it out here:
> http://trac.annodex.net/browser/standards/draft-pfeiffer-cmml-current.xml

How should multiple languages be handled? Since translation 
is one of the most popular uses of captions, I think it's important to 
document and implement this early.

I can think of:

1) Separate CMML streams for each language. This has the advantage that 
the rest of the CMML can be translated as well, and the appropriate 
substream can be picked out from the skeleton entry. The disadvantage is 
it's a lot of overhead if you've got 30 languages that are just 
translations with the same in and out points. Woe betide the muxer that 
tries to sync them all after a seek.

2) Allow xml:lang as an attribute of the <caption> element (and others?). 
I notice the W3C timed text rec includes xml:lang in the schema, so 
it's not unprecedented. However, this means the renderer must show only 
the appropriately language-tagged elements, something the TT rec doesn't 
seem to address, and which definitely requires some specification 
on how to pick the right ones. This probably also means extra work for 
something like xml2po translation workflows. Advantages are lower 
overhead in the stream and all your annotations in one place.
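
To make the selection problem in (2) concrete, here is a rough Python 
sketch of what a renderer might do. Everything specific in it is an 
assumption for illustration: the sample fragment, the idea of putting 
several <caption> children in one <clip>, the pick_caption name, and the 
matching rule (exact tag, then a prefix match such as "fr" for "fr-CA", 
then an untagged caption as fallback). None of it comes from the draft.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment: several <caption> children per <clip> is an
# assumption made for this sketch, not something the draft specifies.
SAMPLE = """<cmml>
  <clip id="c1" start="npt:0">
    <caption xml:lang="en">Hello</caption>
    <caption xml:lang="fr">Bonjour</caption>
    <caption>(music)</caption>
  </clip>
</cmml>"""


def pick_caption(clip, preferred):
    """Return the caption text in clip that best fits the preferred tags.

    preferred is a list of language tags in priority order.
    """
    captions = clip.findall("caption")
    for want in preferred:
        want = want.lower()
        for cap in captions:
            have = cap.get(XML_LANG, "").lower()
            # Exact match, or the caption tag is a refinement of the
            # requested one (a request for "fr" accepts "fr-CA").
            if have == want or have.startswith(want + "-"):
                return cap.text
    # No tagged caption matched: fall back to an untagged one, if any.
    for cap in captions:
        if cap.get(XML_LANG) is None:
            return cap.text
    return None


if __name__ == "__main__":
    clip = ET.fromstring(SAMPLE).find("clip")
    print(pick_caption(clip, ["fr"]))
    print(pick_caption(clip, ["de", "en"]))
```

Even this small version has to decide on prefix matching and on what to 
do with untagged captions, and it ignores xml:lang inherited from a 
parent element, which is the kind of thing the spec would need to pin 
down.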

I think (1) is the default from how our layers are structured. If we're 
worried about overhead maybe we need to support both?

 -r

