Ibm speech to text prototype

1/17/2024

Ibm speech to text prototype

Read Now

This entry was posted on Wednesday, September 12th, 2012 at 12:45 am and is filed under ibm, tech. And the UIMA pipelines are real and coordinating the work, demonstrating that the architecture is sound. However, there are real example implementations of each of the concepts and ideas in the vision, showing that the basic idea has potential and that these technologies are a good fit for this challenge. It’s not yet truly general purpose, and many of the key components are hard-coded or stubbed out. This is not a finished, fully-implemented product. Once analysed, the system is ready to react to commands about the web page. Each of the annotators in each stage of analysis contributes it’s own metadata, building up our picture of what the page contains. As elements in a pipeline, contributing a strategy to a collection of results, they’re each useful indicators. But they’re not stand-alone applications. It goes through a pipeline made up of several annotators, each of which is looking for something different in the page.īy itself, no one of these strategies are fool-proof. The contents of the web page – both the text contents, and the DOM structure of the HTML markup, including any semantic tags used – go through a UIMA analysis engine. When the user visits a webpage, the system needs to try and interpret it. Interpreting the web page that the user has just visited.Some use machine learning (using IBM’s SPSS Modeler) to learn from patterns in trends in the way web pages are structured and organised. Some of these strategies include using natural language processing (using IBM’s LanguageWare) to interpret meaning behind text. It is made up of pipelines of discrete annotators, each using a different strategy to try and identify meaning behind areas of the web page the user is viewing. Our prototype is built using a UIMA architecture. It can use this to provide a speech response with the information that the user wanted.Ĭommands can be typed, or provided as speech input.Ī video of some of the capabilities of the prototype In response to commands, it can navigate around them, and identify areas of the page, and the types of text and media. It is a proof-of-concept that can interpret many websites. We’ve developed a prototype that shows what the Conversational Internet could be like. I want to talk about Extreme Blue, but I’ll save that for another post. It was accepted, and this summer I’ve mentored a group of interns who did amazing work on turning the vision into a prototype. Earlier this year, I proposed it as an Extreme Blue project. I was convinced that we could do something amazing here. This idea came from a meeting that I wrote about last year with the RLSB : a charity for the blind. The vision is software that can interpret web pages and offer a conversational interface to web browsing.

If it understood more about what the user wants to do, the user could just say that, instead of working out the manual navigation steps themselves. It could read out the relevant bits, instead of everything. If it understood more about the web page, it could describe it at a higher, task-oriented level. But by the time you get to the end, you can’t remember whether the closest match was the third or fourth, or fiftieth option. Imagine none of the options are an exact match for what you want. Imagine this menu was so long it took 15 minutes or more to read. “for blah-blah-blah, press 1, for blather-blather-blather, press 2, for something-or-other, press 3 … for something-else-vague, press 9 …â€ The user has to make a mental model of the structure of the page as it’s read out, and keep this in their head as they arrow-key around the page.įor example, on a news site’s front page, once the screen reader has read out the page, you have to remember if the story you want is the fifth or sixth story in the list so you can tab the right number of times to get to it. Visually impaired people can interact with the web using screen readers. I’ve written about this before but here is a recap. I wrote yesterday about what it was like going to the BBC to talk about a project I’ve been working on this summer. It’s a proof-of-concept, but it’s an exciting idea with a lot of potential and we’ve got a demo that shows it in action. One that can offer a conversational interface instead of just reading out everything on the page.A screen reader, but one that uses machine learning and natural language processing, in order to better understand both what the user wants to do, and what the web page says.We’ve built a prototype to show how we could interact with the Internet using a command-driven approach.

0 Comments

Ibm speech to text prototype

Leave a Reply.

Author

Archives

Categories