Monday, 27 February 2012

EPG for the Blind / Talking EPG / Speaking EPG (2006)

In late 2006, I reminisced about a very special interaction I had with a blind person who left quite an impression on my life at the time. Whilst I was still at university I used to work weekends and holidays for a major national clothing retail store called Asmalls.  The receptionist at the head office, who was also personal assistant to the CEO, was a lady who was completely blind. She operated a computer, answered the telephones, listened to the radio and read books all by herself. She even lived on her own and braved the walk to work by herself... she was absolutely amazing. She inspired me to raise the idea of a Talking TV and to learn about accessibility to the point of becoming evangelical about it - so much so that I took the concept from idea to prototype in under a year, all in my spare time, back in 2006/2007 when absolutely no other company or research body was publicly demonstrating this technology on set-top boxes.  Alas, what should've been my chance at innovating for humanity was beaten by another company, who partnered with the leading blind organisation in the UK and produced the world's first Talking TV product 3-4 years later...

Again, why am I posting this now - almost 6 years later??
If you've been reading my other posts on Ideas, you'll know that I want to use my blog as a platform to make public the ideas I've had to date as well as my current ideas, in the hope that it will lead to making connections and other opportunities.  With this particular idea, I want to show you that I've done real work, with real organisations and people, and have what it takes to take something from a concept through to completion, despite resistance from a corporate that saw no real monetizing value in the concept, although it fully supported the idea as being socially beneficial.

In 2006 I set out on this pet project to prove I could build a talking TV product using existing set-top-box hardware and software, without the need for additional and complicated hardware.  I sought out real people to gather the requirements, and iterated through a few prototypes until I had a working demo of the concept running on a BSkyB platform, which is currently powered by NDS software (the fact that NDS powers BSkyB products is already in the public domain).  Working outside normal work hours, I built up relationships with other industry players and engaged in workshops.  I found out at the time that the RNIB was indeed engaged with OceanBlue on a similar project, but they shared their requirements with me in the spirit of collaboration.

What set my project apart was that my implementation was based entirely on software speech synthesis, using an Open Source technology called Flite (Festival-Lite), offered free for commercial use by the kind people at Carnegie Mellon University.  The other projects took a different approach, enhancing the physical hardware chipset with speech processing functions. This added cost to the set-top-box hardware, and the quality of the sound was questionable at the time as well.  What eventually transpired was that the end product that was released was based on a commercial speech engine offering better quality voices...

In short, I was able to showcase my demo to prospective customers and had a real working prototype built in 2007 - I've got a video that proves it as well.  My work was mentioned in a press release in Jan 2009, but we didn't go public with any official demonstration of the technology. Instead, the PR noted that:

NDS is also researching new technologies that will help sight-impaired people to access Electronic Program Guide information and other digital TV services. One such NDS innovation enables people who are sight-impaired, and others who have difficulty using text-based EPGs, to hear EPG listings and content read aloud in real time in a natural sounding voice.

Getting my work mentioned in a press release felt great - my fifteen minutes of fame.  Alas, despite all my efforts at self-promoting the project, it lacked monetizing appeal and wasn't in line with the product's roadmap and customer expectations at the time. People were more interested in home networking and advanced 3D graphics and animation than in promoting accessibility. The feeling was that, from a technology-provider point of view, it's easy to implement but there wasn't demand. The onus was more on operators and broadcasters, and depended on regulation. With little regulation to enforce this upon operators, this feature fell right to the bottom of the priority queue.  Even six years later, in 2012, despite headway being made in the EU/UK/US, it is still not a mandatory requirement to implement accessible EPGs using speech synthesis. It is rather an optional, much desired nice-to-have, and the market demand is small... So to cut a long story short, my project was put on the back-burner, giving other companies enough time to catch up and make use of this niche space that had not yet been filled - and so instead it was OceanBlue that went public, in Sept 2009...

Take a look at this video:

Suffice to say, I took that a little personally because I felt we'd missed out on the chance of becoming the first company to announce to the world that STBs can become more accessible, and my Speaking EPG project died a quiet death!  I had even pursued the patent route years earlier, but there was enough prior art in the PC domain to rule the concept out for patenting...

The options a typical technology provider faces
STB Middleware providers are usually focussed on providing compelling, exciting features and pay little regard to promoting accessibility. This is also evident in most of the advanced user interfaces being published today: full of flashy animation and graphics, using 3D images and video clips, and paying little attention to usability requirements, especially for a rather overlooked segment of the market, visually impaired audiences. Instead, the focus is on maximising the screen real-estate offered by HD resolutions - typically usability / accessibility is added as an afterthought.  Why can't these providers learn from Apple, who've designed their products with accessibility in mind from the get-go? The operating system does not feel clunky at all, and enabling VoiceOver or any other accessibility feature is almost natural to the interface... If only Digital TV Middleware providers and EPG Application vendors could support this mindset... So some of the options (as shared by an ex-colleague, Paul Jackson) available to these vendors are:
  • Do nothing except what customers (operators) specifically request
  • Assume that this is a niche product / service area, thus provide APIs etc as below to let third parties sort it all out, as you don't believe it can make enough money or produce enough innovation in this area 
  • Assume that this is a niche product / service area, but try to make money and encourage innovation from within, leaving room to change direction if it turns out not to be such a niche area as is assumed
  • Realise that easy to use accessibility features and functionality are a key part of good general product and service design and will have positive, possibly unexpected and potentially lucrative spin-offs in the general mass market if designed in from the outset, and choose to develop accessibility solutions first, then encourage mass market for them (top down approach)
  • Realise that easy to use accessibility features and functionality are a key part of good general product and service design and will have positive, possibly unexpected and potentially lucrative spin-offs in the general mass market if designed in from the outset, and choose to develop mass market accessibility related solutions first, then provide more niche / specialised accessibility solutions once volumes of suitably enabled devices are in the market (bottom up approach)
Current Real-world Products Promoting Accessible Talking TVs
The RNIB has a good summary of the current technology offerings to date - see Accessible TV Devices promoting the Smart Talk TV (from RNIB/OceanBlue partnership), Sky Talker (BSkyB's own proprietary standalone device) & Apple TV (promoting Apple's VoiceOver technology).

I can offer consultation on STB/TV Accessibility Topics & Technology
Based on my close experience and hands-on implementation of the technology itself, I can offer consultancy on accessibility issues around TV. This has become a personal passion of mine.  If you're looking to implement speech processing in software, or to prototype and experiment with speech technology, I can advise on the options available.  I am planning to write a white paper that describes this topic in a fair amount of detail....

White Paper In Progress
In my upcoming white paper, I will introduce the need and demand for Speaking Programme Guides using data from a variety of sources, will explain the high level requirements from a user point of view, provide an industry update as well as benchmark the technology options available today. I will also highlight a template project that one can use based on Open Source technology....Please get in touch with me if you can't wait for this white paper....

Other Snippets of Industry activity proving I wasn't all that nuts

Proof of what the market is doing:

Proofs of Idea
Original Idea Posted: 18 October 2006 EPG for the Blind

I have a friend who is blind - she is employed as a personal secretary (is able to email, fax, draft letters, etc), lives on her own, cooks her own food, travels to work alone, listens to music, reads books and is an amateur musician - which is quite amazing really.

She is able to do her work (and this is going back to Windows 3.1 days) by using a specialised Microsoft product for the blind (hard-of-seeing?) - i.e. every key stroke is read out, etc. She is so good she can even touch-type without any braille on the keyboard... This is great for the PC, but what about TV, especially the EPG?

Certain channels broadcast events with audio descriptions - this is great. But what about pressing the remote control keys, menu keys, menu options - grid, planner, etc - how is a blind person to navigate this? Also, how is one to find all the channels/events that have audio descriptions, or even radio channels?

Basic idea: Embed audio as part of the navigation of the EPG, e.g. pressing menu would read out "Menu", and as one navigates through the menu, each option in turn is read out...
That is only the beginning - the next step would be to embed audio description into the event information/synopsis.
And a further step would be to provide a personalised guide that is filtered for the person's specific needs - and as she navigates through a grid, for example, the content of each grid block is read out aloud...
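
A minimal sketch of the basic idea above, written in Python purely for readability (a real set-top box would do this inside the middleware or EPG application). The names `SpeakingMenu`, `describe_event` and `speak`, and the menu labels and event details, are all hypothetical. Here `speak` simply records the text, where a real box would play an audio clip or call a speech engine.

```python
from typing import Callable, List


class SpeakingMenu:
    """A vertical menu that announces the focused option on every key press."""

    def __init__(self, title: str, options: List[str], speak: Callable[[str], None]):
        if not options:
            raise ValueError("a menu needs at least one option")
        self.title = title
        self.options = options
        self.speak = speak  # hypothetical hook: clip playback or a speech engine
        self.index = 0

    def open(self) -> None:
        # Pressing "menu" reads out the menu title, then the focused option.
        self.speak(self.title)
        self.speak(self.options[self.index])

    def move(self, step: int) -> None:
        # Clamp at both ends so focus never leaves the list, then announce it.
        self.index = max(0, min(len(self.options) - 1, self.index + step))
        self.speak(self.options[self.index])


def describe_event(channel: str, start: str, title: str) -> str:
    """Build the sentence read out when a grid block gains focus."""
    return f"{channel}, {start}, {title}"


spoken: List[str] = []  # stands in for the audio output
menu = SpeakingMenu("Menu", ["Grid", "Planner", "Radio"], spoken.append)
menu.open()
menu.move(+1)
menu.move(+5)  # past the end: focus stays on the last option
spoken.append(describe_event("Channel 1", "20:00", "Evening News"))
```

The point of the sketch is only that every change of focus produces exactly one announcement, so the listener always knows where they are.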

Embedding built-in EPG audio samples/voice-overs shouldn't be too much of a problem - probably a few K of memory, which modern STBs can easily spare. Broadcasting audio descriptions of events should add little to the bandwidth overhead, though a few Kbps is likely...

Where is the money to be made? None - I hope - think of the service this would do to the community/society... 

Comments Received...
An alternative approach which eliminates the cost of producing and transmitting spoken EPG data and synopsis information would be to simply enable the text highlighted / presented by navigation to be delivered as a data string out of a serial / USB port on the STB at the same time as it is rendered on the screen. External devices which are capable of text to speech are readily available (including as a standard feature of Microsoft OS) and offer multi language "dictionaries". The same methods can be extended to provide subtitle to speech capability.

One commercial downside is that it would be possible for third party devices to "harvest" this EPG and Synopsis data for other purposes. If this is considered to be a problem then the serial data could be encrypted using NDS solutions so only approved / paired devices could decrypt and text to speech these strings.
That is interesting information indeed - but on the humane side, I'm against add-on peripheral features (leading to bloat, just look at the X-Box as an example!). If a system can be made to support a feature natively, then requiring an add-on may even be considered discriminatory - why should someone who's blind have to incur extra setup costs to experience TV compared to someone who isn't blind?

If the technology exists, it may well be worth considering implementing/porting this text-to-speech stack directly in a middleware itself...
Thinking about this a little more: the feature to support the blind should not be obtrusive or invasive, and should not disrupt the viewing of other members of the household. As such, it should be discreet and could possibly use the following:

- The middleware could provide separate accessibility features like an independent remote command set - that allows one to seamlessly browse the EPG data without disrupting the live viewing at all. The EPG information can be streamed wirelessly to a headset and thus the audio of the TV program currently being listened to (by others), is not disrupted. Even helpful audio description could be sent in this format.

So moving forward, possible solutions for an EPG for the blind include:
1. Embedding audio to each remote control command
2. Embedding audio into the TS stream
3. Using a third-party add-on that does Text-to-Speech translation
4. Building a Text-To-Speech translator as a component of the middleware
5. Streaming this EPG info in the least disruptive way, e.g. wireless headset
6. The service provider to provide a separate channel (like info/help) dedicated to serving the needs of the blind (like commentary on what's on, etc - perhaps even through an interactive app?)
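
As a rough illustration of how options 1, 4 and 5 could sit together, here is a small Python sketch. Everything in it is an assumption for illustration: the clip file names, the `synthesise` stand-in (a real middleware would call its text-to-speech engine there) and the `Announcer` class are hypothetical and not taken from any real product.

```python
from typing import Callable, Dict, List

# Option 1: fixed UI labels map to pre-recorded clips (hypothetical file names).
CLIPS: Dict[str, str] = {
    "Menu": "menu.wav",
    "Grid": "grid.wav",
    "Planner": "planner.wav",
}


def synthesise(text: str) -> str:
    """Option 4 stand-in: a real middleware would call its speech engine here."""
    return f"tts({text})"


class Announcer:
    def __init__(self, sink: Callable[[str], None]):
        # Option 5: the sink decides where the audio goes (TV speakers, headset, ...).
        self.sink = sink

    def announce(self, text: str) -> str:
        """Send audio for `text` to the sink and return which source was used."""
        if text in CLIPS:
            self.sink(CLIPS[text])
            return "clip"
        # Dynamic text such as event titles has no clip, so fall back to synthesis.
        self.sink(synthesise(text))
        return "tts"


headset: List[str] = []  # stands in for a wireless headset audio path
announcer = Announcer(headset.append)
sources = [announcer.announce(t) for t in ["Menu", "Evening News, 20:00"]]
```

Keeping the sink separate from the speech source means the same announcements could go to a headset without touching the audio of the programme everyone else is watching.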
I just wanted to share with you some real-life feedback on the accessibility of the UK's Sky+ HD box. For the original post, you can read Damon Rose's blog.

Hi there.
Sadly the new Sky HD box isn't accessible.
The elements that I most want to use, and that I could use on their previous boxes, have now been re-designed in such a way that they're impossible to navigate.
So whereas I could previously go into the list of programmes I've recorded on Sky + PVR, arrow up and down and have a good idea where my programme is ... I now can't. Arrowing up and down no longer is confined to the 'programmes you've recorded' list ... it also takes the control left and right, in a manner which is hard to understand if you can't see. It goes into sub menus of 'series linked' programmes ... and takes you off round the screen if you arrow too far.
Very disappointing, a disappointment that I now spend 10 pounds a month extra on.

So one lesson to learn here is a well-known one, but worth reminding ourselves of for any development project that has to deal with a legacy of users: be careful not to break features that people have become accustomed to, and consider all your users no matter how small the market is. Customers are paying for the service and are contributing to the bottom line...

Like all other improvements to software products, accessibility needs to be built in from the start, and not added as an afterthought.
