Jul 27 2011

Video delivered over the web is difficult.  Really difficult.  Firstly, like the “Browser Wars” depicted here:

Firefox and Chrome duke it out with IE and Safari

there is now a war, or at least a series of skirmishes, between Google, Adobe, Apple, Mozilla and, lastly, Microsoft.  Before Real’s RealPlayer and later Adobe’s Flash video (FLV) there were as many video formats as there were software vendors in the Microsoft ecosystem.  At the height of Real’s and Adobe’s success the problem was that the players and the video codecs were restricted to two (popular) platforms; they were not accessible to everyone.  Adobe went further than Real to make Flash work in more places, and video on the web has gained huge traction.  Apple, to compete, has worked hard with QuickTime and lately has concentrated on MP4 with H.264 video encoding.  MP4 is supported by Adobe’s Flash, but Flash breaks the user experience on the web and breaks accessibility rules.  The answer to this would appear to be HTML5 with its support for the <video> markup.  Currently, there are two battles.  The first is between Adobe’s Flash and the HTML5 way of doing things.  HTML5 will win; Adobe is already converting its products to produce HTML5 rather than Flash and Flex.  The other battle, which is a bigger problem in this context, is the format war.

Many devices are capable of playing video.  There are three ‘popular’ platforms for video playback, Apple, Android and Microsoft, which exist both on the desktop and on mobile devices.  There are also all of the Linux devices (desktop, mobile and consumer devices like TVs).  Apple (especially), Android and Microsoft are picky about how video should be packaged for playback, with little overlap between them.  Linux is of course more forgiving.  There are at least 3 ways to package video (the video and the audio together) and at least 6 codecs to use.  Mozilla would have us use the Ogg family of codecs because they believe that they are licence free and that they don’t impinge on other software patents.  Patents are not a problem in every jurisdiction, but commercial organisations tend to work with what works in the USA.  This work would be easier without patents on software, but here we are.  Microsoft are currently siding with Apple.  Apple will gain from monies collected when H.264 (video) is no longer free, and their own audio codec tends to be the default in MP4.  Microsoft don’t have their own codec.  Google have been flexible but recently bought On2 and have developed VP8 (video) as WebM.  Google are dropping support for H.264 in their ChromeOS.  Flash runs on Android right now.  One of the other factors is that hardware support has favoured H.264 video.  On what I call compromised devices, other formats must be made easier to decode (and therefore bigger in file size).  Later we will see hardware support for WebM, depending on how the format war plays out.

What does this little lot mean to this project?  Mostly, it means compromise.  At some point in the future it will be easier to code video and HTML to deliver video from a streaming server, but right now you either have to code for a restricted set of clients or write ‘clever’ code that inspects the web browser and delivers the correct video format in the correct markup.  I found a project that is free to use, Video JS, which combines JavaScript, browser interrogation and some document re-writing to present the correct video and HTML5 or Flash HTML code.  With three versions of the same video, encoded for the lowest common denominator of mobile devices, the video should play back on nearly all recent devices.
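As a rough illustration of the "three versions of the same video" idea, here is a minimal Python sketch that writes a <video> element with one <source> per encoding, so the browser can pick the first format it can play.  This is not Video JS's own API.  The function name `video_markup`, the base URL and the fallback text are made up for the example, and it assumes the three files share a name and differ only in extension.

```python
from html import escape

# MIME types for the container formats mentioned in this post.
VIDEO_TYPES = {
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".ogg": "video/ogg",
}


def video_markup(base_url, extensions=(".mp4", ".webm", ".ogg"), fallback_html=""):
    """Build a <video> element offering one <source> per encoding.

    base_url is the (hypothetical) URL of the video without its extension.
    fallback_html is shown by browsers that do not understand <video>,
    for example a Flash embed or a plain download link.
    """
    lines = ["<video controls>"]
    for ext in extensions:
        src = escape(base_url + ext, quote=True)
        lines.append(f'  <source src="{src}" type="{VIDEO_TYPES[ext]}">')
    if fallback_html:
        lines.append("  " + fallback_html)
    lines.append("</video>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(video_markup("/media/lecture", fallback_html="<p>Download the video.</p>"))
```

Listing the sources in order of preference leaves the choice to the browser, which is less brittle than sniffing the user agent on the server, although old Flash-only browsers still need the fallback content.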

That’s great but unfortunately, for this project, there is a further wrinkle.  The above works, but you need a web server that can stream video and that is sympathetic to the client; that is, it must allow the client to jump to portions of the video using its chosen method.  This works with Apache (with h264_streaming_module and some workarounds for Apple mobile clients), but we’re dealing with Apache’s Tomcat, which doesn’t have the h264_streaming_module available to it, and the DSpace code doesn’t have anything similar (that I know of).

The problem we want to solve is the ability to jump in the video to a specific part.  I took two approaches to solving this:

1. I looked at code that supports byte streaming.  Byte streaming allows ‘clever’ browsers to ask a web server for a chunk of a file from anywhere within the file.  This is the way forward and it’s likely that all web clients will support it.  The function is supported in Tomcat (Cocoon) but is switched off in DSpace because it broke a popular PDF client.  I switched it back on and hope that we’ll find a workaround for broken PDF readers.  I also created a PHP script that is called by Apache when the video file name ends in .webm, .mp4 or .ogg.  The PHP supports byte streaming.  This method doesn’t support Flash and so breaks support for Android and for old web browsers that only support Flash and not HTML5 video.

2. Inspiration hit.  I thought I’d cracked the problem.  I know that I can, broadly, support video using Apache.  One way would be to create a video streaming server that uses Video JS (with h264_streaming_module and my workarounds for Apple) and store the video objects there.  In practice this would break the paradigm of a repository: only the URL pointing to the video would be kept in DORA, and there would be an administrative overhead because, should we change DORA, we would have to remember to change the streaming server too.  Inspiration came in the form of redirects.  I thought that if I could create a symbolic link to the object (which has a meaningless name) on the Tomcat file system, with a file name that has meaning to Apache, I could get Tomcat to ask the browser to redirect in the same transaction, handing the streaming responsibility to Apache.  The correct HTML is delivered to the browser, the symbolic link is created if it hasn’t been created before, and the browser is told to get the video from Apache instead.  The solution was quick to code but, and it’s a big but, while all web clients support redirects in a general way, not all browsers support redirects within their video playback functionality.  This includes all Flash playback.
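The symbolic link and redirect idea in method 2 can be sketched roughly as follows.  This is Python rather than the code that ran inside Tomcat.  The function name, the directory layout and the URLs are all hypothetical, and it assumes Apache serves the link directory and is allowed to follow symbolic links.

```python
import os


def redirect_to_streaming_copy(object_path, link_dir, link_name, public_base_url):
    """Make sure a meaningfully named symlink to the stored object exists,
    then describe a redirect that hands the request over to Apache.

    object_path      path of the stored object (with its meaningless name)
    link_dir         directory that Apache serves (assumption)
    link_name        file name with an extension Apache understands, e.g. "talk.mp4"
    public_base_url  URL under which Apache exposes link_dir (assumption)

    Returns (status, headers) for the caller to send to the browser.
    """
    os.makedirs(link_dir, exist_ok=True)
    link_path = os.path.join(link_dir, link_name)

    # Create the link only if it has not been created before.
    # An existing link is trusted as-is; this sketch does not re-check its target.
    if not os.path.islink(link_path):
        os.symlink(os.path.abspath(object_path), link_path)

    location = public_base_url.rstrip("/") + "/" + link_name
    return 302, {"Location": location}
```

The redirect itself is ordinary HTTP.  As described above, the weak point is on the client side: a player that will not follow a redirect inside its video request never reaches the Apache copy.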

As it stands, we are using method 1 because, if we don’t come back to it, this is the method that should be supported later by newer browsers and mobile devices.  Given the chance, I would see if there is a streaming solution for Tomcat that supports both byte streaming and Flash-type streaming.
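For readers who have not met byte streaming, the exchange behind method 1 can be sketched in a few lines of Python.  This is not the PHP used on the project.  It assumes a single-range "Range: bytes=..." request header and an in-memory file, and the function names are made up for the example.  It is also stricter than a real server, which would ignore a malformed header and send the whole file.

```python
import re

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_byte_range(header, size):
    """Return (start, end), both inclusive, for a single-range header.

    Returns None when the header is malformed or cannot be satisfied
    for a file of `size` bytes.
    """
    match = _RANGE_RE.match(header.strip())
    if not match or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    end = min(int(last), size - 1) if last else size - 1
    if start > end:
        # Covers both "start beyond end of file" and "last before first".
        return None
    return start, end


def serve_range(data, range_header):
    """Return (status, headers, body) for a request against `data`."""
    size = len(data)
    if range_header is None:
        return 200, {"Accept-Ranges": "bytes", "Content-Length": str(size)}, data
    span = parse_byte_range(range_header, size)
    if span is None:
        return 416, {"Content-Range": f"bytes */{size}"}, b""
    start, end = span
    body = data[start:end + 1]
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

When the viewer drags the playback position, an HTML5 player can ask for a range starting part-way through the file, and the 206 response carries only that slice.  That is what lets the client jump to a specific part of the video without a dedicated streaming module.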

Jun 07 2011

We had a very useful meeting with Laurian from the RSP last week. It was helpful to discuss our progress and findings so far. Laurian said she will be blogging in detail about the meeting so I will post a link when she has.

It is now confirmed that the JISCrte projects will be having their mid-term meeting in Edinburgh in August. Alan will be attending; this is a useful opportunity to talk to the other projects in our funding stream as well as to share our findings with people attending the Repository Fringe.

I will also be attending the Kultivate end of project event in July and giving information about EXPLORER. I look forward to some useful discussions.

Otherwise the work is progressing nicely. Technically we are looking at how to integrate non-text outputs better within DORA and incorporating the CERIF4REF tool which will enable DORA to link with the REF admission system when it arrives. An updated advocacy strategy and materials are being drafted to support the enhancements to DORA.

Mar 08 2011

Project AIR was concerned with the automatic, hands-off population of OAI-PMH compliant Open Research Archives with linked data taken from institutional web sites.  A high aim indeed, which is recognised and moderated.  The tool crawls a web site to find research output for academics local to the institution.  Machine learning combined with a web crawler makes up the meat of the project.  It is hard to extract linked data from a content management system where the content is at best prescribed and at worst free prose, and either way not compatible with any system linking to a repository.  Another way to do this is by developing a culture around the submission of research outputs to the group responsible for the repository.  Cultures are very fluid things; they fade in and out of existence in different parts of an organisation, and therefore a tool like this helps even if it doesn’t complete the effort.  The application of this tool to DORA will provide another way of adding outputs to the repository without manual input and repeated effort, which is always of benefit to all users.

Posted at 9:16 am
Mar 08 2011

The Kultur project was funded by JISC. It was undertaken by the University of Southampton, the University of the Arts London, the University for the Creative Arts and the Visual Arts Data Service. The project hoped to address the need to manage non-text outputs effectively. The primary aim was to establish a shared practice across the sector, which DMU hopes to benefit from.  The project effected this, essentially, by changing the way the popular EPrints Open Research Archive deals with different media (images and video); by looking at workflow, metadata, preservation, curation and IPR; and by looking at the way different types of users approach the typically cold or sterile user interfaces of research archives or repositories.  Repositories should go well beyond just being used for REF outputs!

The project worked with core standards: Dublin Core for metadata and the Joint Academic Coding System (JACS) for its subject classification.  It also deviated from standards where the users’ needs required it.  The project looked at adding two profiles to EPrints: the Images Application Profile and the Moving Images Application Profile.  The Moving Images Application Profile wasn’t finished before the end of the project, but work continues.  It’s likely that the code will need to be re-visited as HTML5 matures.

In the implementation of the project:

  • the range of research activity was captured
  • metadata was added to reflect the new media’s attributes
  • real examples were found early in the project
  • provision was made for digital capture of some research outputs
  • a user survey was performed
  • features added: lightbox, slideshow (and video viewer)
  • more features: tabbed abstract pages, visual search results

As part of the EXPLORER project we are looking to apply what was done and learnt during Kultur to our repository.  De Montfort University’s Open Research Archive is built using DSpace, which is a very different animal to EPrints.  On initial inspection, DSpace has the hooks for code to treat objects in the repository according to their media type.  This would probably mean different HTML markup for each media type, which would then be handled by the CSS or JavaScript to perform the magic.  The university’s research archive officer and technical lead have said that DSpace will support differing workflows and styles which can be selected by the user.

We will work either with the latest stable DSpace code or with the current code in SVN.  We will also look for DSpace plugins that match our needs or can be adapted.  We will keep the blog updated with our progress.