Multi-format support and harmonisation

This topic describes the requirements for supporting multiple structured content formats. It also briefly considers a harmonisation approach that is supported by the current collaborators. It includes some thoughts on the initial formats to be supported and why they are suggested, but does not mandate specific formats.

In order to promote broad adoption, it is important that this authoring and aggregation tool supports multiple content formats rather than mandating a particular format. Different communities will use different content formats depending on their individual requirements, organisational affiliations and partnerships. Note, however, that this application is not required to support unstructured content authoring (e.g. HTML), although nothing should prevent a particular community or organisation from adding that functionality if it chooses to use a single application for multiple requirements.

One approach to supporting multiple formats is to harmonise the existing schemata into a "base" schema that incorporates most or all of the elements already developed across a set of XML schemata. Communities could then extend the range of elements locally by adding further schemata that reflect local requirements. This is similar to the approach taken by Connexions with schemata such as MathML, where it was more sensible to include work already done than to redo large volumes of analysis and XML schema development.
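The harmonisation idea can be illustrated with a minimal sketch. It assumes each schema can be reduced to a set of element names, which is a simplification, since real schemata also carry content models and attributes. The format labels, element names and the `min_support` threshold are hypothetical and are not taken from any of the formats discussed here.

```python
from collections import Counter


def build_base_schema(schemata, min_support=1):
    """Return the element names that appear in at least `min_support` schemata.

    `schemata` maps a format label to a set of element names.
    min_support=1 keeps every element ("all of the elements");
    a higher value keeps only widely shared ones ("most of the elements").
    """
    counts = Counter()
    for elements in schemata.values():
        counts.update(set(elements))
    return {name for name, seen in counts.items() if seen >= min_support}


def extend_locally(base, local_elements):
    """Add community-specific elements on top of the shared base."""
    return set(base) | set(local_elements)


# Hypothetical element sets, for illustration only.
schemata = {
    "format_a": {"section", "para", "figure", "exercise"},
    "format_b": {"section", "para", "figure", "glossary"},
}

base_all = build_base_schema(schemata)                    # every element
base_shared = build_base_schema(schemata, min_support=2)  # shared elements only
local_schema = extend_locally(base_shared, {"assessment"})
```

Whether the base should hold every element or only the shared ones is a design decision this page leaves open; the threshold simply makes both options visible.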

There are several content formats that would be of interest within the OER communities and more broadly within education and training communities. These are listed below with some brief comments.

CNXML (Connexions XML)

Connexions is a very significant player in the OER space. It was also one of the few initiatives that carefully considered both the philosophical and technical implications of a "create, rip, mix and burn" approach to educational content. As a result, both the licensing and content format are designed to enable reuse and support interoperability. Connexions produced their own structured content format, CNXML, in order to support the reuse objectives. Tools have also been developed alongside CNXML to convert commonly used content, such as Microsoft Word documents, into CNXML. The accumulation of modules in CNXML means that there is substantial value to be gained from supporting this format in the first stage.

eLML (University of Zurich eLesson Markup Language)

The eLesson Markup Language (eLML) was developed and is maintained at the University of Zurich. It emerged from the GITTA project and is the . eLML takes a narrative approach to content development (ie more like DocBook than DITA) and an early analysis has shown that there is a lot of commonality between CNXML and eLML. The extent of this alignment will be explored and analysed further. The eLML team have also developed a set of tools to support eLML authoring which may also be useful as input technology into this project.

WikiEducator format and tools

WikiEducator is powered by MediaWiki software and uses wikitext as the native user interface. Recently, WikiEducator has deployed a Rich Text Editor which provides a WYSIWYG alternative to standard wiki markup. The use of MediaWiki software and limited interwiki capabilities facilitates import and export between the large databases of free content administered by the Wikimedia Foundation and the OER Foundation, which oversees WikiEducator. WikiEducator uses a template scheme to add pedagogical structure to each page, very similar in intent to the iDevice concept of the eXe tool. Templates are used throughout WikiEducator to build simple structure into pages, and it is this structure that facilitates interesting reuse scenarios. A proof-of-concept tool has been developed that takes a collection of pages and bundles them into an IMS Content Package or IMS Common Cartridge for deployment into an LMS. Using the Collections extension for MediaWiki, it is possible to produce pdf, Open Office (odt) format and Opendoc exports from collections of WikiEducator pages. Both the Open Office and Opendoc formats are open XML schemas. The OER Foundation is collaborating with Connexions on a project to improve remix potential between these respective platforms.

OUXML (Open University XML)

The Open University of the UK (OU) has made a range of courses available on the OpenLearn site. The content available on this site has been developed in, or converted to, the Open University's own XML format, OUXML. Once ready for 'publishing', content in this format is routinely converted into several publishing formats including IMS packaged content and Common Cartridges, print formats, and Moodle Backup files. The content is freely available for anyone to use. Institutional support and commitment to produce OUXML content is very significant and means that the volume of this content will continue to increase. The OU, however, also lacks a simple structured authoring and aggregation tool to support OUXML. This not only adds complexity for the OU in developing its own materials but also makes it more difficult for this format to be adopted more widely, where it may provide greater value to other educational communities.

DITA (Darwin Information Typing Architecture)

The Organization for the Advancement of Structured Information Standards (OASIS) has a substantial portfolio of standards, several of which are structured content standards. The Darwin Information Typing Architecture (DITA) began as a simplified form of XML for technical documentation, although its origins are traceable to earlier work. It passed to OASIS in 2004 and subsequently became an OASIS Standard in 2005. Version 1.2 of the standard is anticipated in late 2009.

Since its formalisation as a standard, a significant and growing number of vendors have been adding DITA support to their products, and it has been adopted by many parts of the technical communication community and the companies for whom they work. More significant in relation to this project are the specialisation capability within the architecture and the emergence of the Learning and Training Specialization for DITA content, which will be a part of the next version of the standard. While there are growing numbers of tools for professional communicators, there is still an absence of simple tools for everyday users. A simplified approach to DITA would be a useful inclusion in this authoring and aggregation project.

There would most likely be limited expectations around the support for complex formats or entire content architectures such as DITA. It is more likely that DITA support would be limited to support for the learning and training specialization. Professional tools already exist for advanced requirements.

Other formats

Aside from these more common content formats there are many others that could benefit from the same simple authoring approach. In order to accommodate these requirements it is envisaged that the tool would have a 'plugin' approach to content formats, so that individual communities or organisations would be able to add their own format and possibly limit others that are not for use within the community. This project will provide the architectural approach to accommodate formats but would not plan to actually perform such tasks. That would be one area in which the developer community or specific communities of practice would take the lead. Additional formats may or may not form part of the base distribution of the application.
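A 'plugin' approach to content formats could look something like the following minimal sketch. Every name in it (`ContentFormat`, `FormatRegistry`, `PlainParagraphs`, the `load`/`dump` methods) is hypothetical, and the in-memory document model is assumed to be a plain dictionary purely for illustration; the project has not defined such an interface.

```python
from abc import ABC, abstractmethod


class ContentFormat(ABC):
    """Hypothetical interface that each format plugin would implement."""

    name: str

    @abstractmethod
    def load(self, text: str) -> dict:
        """Parse serialised content into the tool's in-memory document."""

    @abstractmethod
    def dump(self, document: dict) -> str:
        """Serialise the in-memory document back into this format."""


class FormatRegistry:
    """Keeps track of the format plugins installed in one distribution."""

    def __init__(self):
        self._formats = {}

    def register(self, plugin: ContentFormat) -> None:
        if plugin.name in self._formats:
            raise ValueError(f"format already registered: {plugin.name}")
        self._formats[plugin.name] = plugin

    def disable(self, name: str) -> None:
        # A community can drop formats it does not want to offer.
        self._formats.pop(name, None)

    def available(self) -> list:
        return sorted(self._formats)

    def get(self, name: str) -> ContentFormat:
        return self._formats[name]


class PlainParagraphs(ContentFormat):
    """Toy plugin: paragraphs separated by blank lines (not a real format)."""

    name = "plain-paragraphs"

    def load(self, text: str) -> dict:
        parts = [p.strip() for p in text.split("\n\n") if p.strip()]
        return {"paragraphs": parts}

    def dump(self, document: dict) -> str:
        return "\n\n".join(document["paragraphs"])


registry = FormatRegistry()
registry.register(PlainParagraphs())
document = registry.get("plain-paragraphs").load("First.\n\nSecond.")
```

Under this arrangement the core application only depends on the interface, so a community could add a plugin for its own format without changes to the base distribution.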

Consideration will be given to the best way to support specialized content formats (e.g. MathML, BibTexML). A range of issues will need to be resolved. For example, should this tool be able to author in such formats, or is it better for such content to be developed in other specialized tools, with this tool simply providing 'awareness' of it? What functionality would be required to support the right approach for the application and its target users?

Along with supporting multiple authoring formats for the actual content, the editor will also support different aggregation formats. It is yet to be decided which aggregation formats would be appropriate if they are not already part of the overall structured content format.

Additional Points in Relation to Multiple Formats

Notwithstanding the ability of the application to accommodate multiple formats, the emphasis is on simplified approaches to the interface and functionality so that users do not get confused or overwhelmed. This tool should provide different ways of working with content formats. In some situations it would be advantageous to have formats exposed to an individual for selection. In others it would be more useful to constrain this function so that the available formats are limited within the context of a particular community or organisation. Such a community may choose to create its own local distribution that only presents the content format(s) relevant to its situation. A community-based distribution of this kind may also include interface customisation to support an overall simplified approach.
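The two ways of working described above, exposing all formats or constraining them for a community, can be sketched as a small filter. The function names and the idea of an "allowed" list are assumptions made for illustration; the format labels are simply those discussed on this page.

```python
def visible_formats(installed, allowed=None):
    """Return the formats a user should see, in installation order.

    `allowed` is a hypothetical per-distribution setting:
    None means every installed format is exposed for selection.
    """
    if allowed is None:
        return list(installed)
    allowed_set = set(allowed)
    return [name for name in installed if name in allowed_set]


def needs_format_chooser(formats):
    """Only show a format picker when there is a real choice to make."""
    return len(formats) > 1


installed = ["CNXML", "eLML", "OUXML", "DITA"]

# General distribution: the individual picks a format.
general = visible_formats(installed)

# Community distribution limited to a single format: no picker needed.
community = visible_formats(installed, allowed=["OUXML"])
show_picker = needs_format_chooser(community)
```

Hiding the picker when only one format is visible is one way the interface could stay simple for a single-format community.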

Work in progress, expect frequent changes. Help and feedback is welcome. See discussion page.
