Kodi Community Forum

Full Version: Xbmc not working for blind users.
(2014-04-28, 17:51)ruuk Wrote: [ -> ]Just so I understand, was this a fix in the code for the Pi sound specifically or in code that is used on all platforms?

It is Pi specific code that is not yet in master branch. It is in the branch that gotham openelec betas build from,
and I think gotham-raspbmc-release raspbmc builds also come from there. It was in this commit (which has now been fixed)
https://github.com/popcornmix/xbmc/commi...6e78532172

The gotham release builds of raspbmc and openelec will have the fix.
You are probably best testing with the next milhouse (openelec) or miappa (raspbmc) build that appears, and they should be fixed.

It looks like openelec has just picked up the fix:
https://github.com/OpenELEC/OpenELEC.tv/...0040L15193

so it should be in next official gotham openelec build.
(2014-04-28, 09:58)VIJO Wrote: [ -> ]LOL, I hope to get quite a few people using XBMC, so we will probably find out, lol. If it's anything like the Google voice hole, it could last a while. But doesn't that cause a lot of delay: sending the text to an external server, converting the file to wav, then having XBMC play the wav? Just some thoughts. As for the wavs crashing, I'm confused. Is it that the wav files are too large or too small? Can we have the TTS engine generate an alternate format? Or maybe better yet, have something outside XBMC play the wav?
If you read some of the later posts after yours, it seems to be related to duration rather than size, and popcornmix has added a fix.
While it is possible on the Pi to bypass XBMC for outputting the sound, most XBMC installations currently don't have alsa enabled by default on the Pi. So for users who aren't tech savvy enough to install alsa on RaspBMC (for OpenElec, I think it has to be compiled with it enabled, and it isn't by default) we have to misuse the playSFX() command in XBMC to output sound.
(2014-04-28, 09:58)VIJO Wrote: [ -> ]I am going to go ahead and purchase a Pi, so I'll give the various speech options and the Cepstral option a try and let you all know. One thing I don't know about the Cepstral option is whether it will work on the Openelec linux version. But I'm going to try to get hold of them and inquire more about it.
Cepstral needs alsa to work, so with a current normal OpenElec build it won't work. What is nice (at least in this case) about Cepstral is that it installs via a script, so it should at least install fine on OpenElec. Of course that doesn't mean it will work, and I don't really want to spend $30 to find out Smile
(2014-04-28, 09:58)VIJO Wrote: [ -> ]Love the list count ideas. Just one of those things I didn't even realize I needed, LOL. Now that I have it, it's like, hmm, that was missing.
Yeah, I definitely need people to tell me these things. Even if I test with my eyes closed, I still have a visual memory of what the screens look like, so it isn't the same.
(2014-04-28, 09:58)VIJO Wrote: [ -> ]Ruuk, I understand your point about the XBMC team and troubles with addons and support. But I would still love to see this be as easy to install for folks who need a TTS engine as it is for everyone else. My thought was that maybe the team would be willing to support it on some level, even if just a little on the programming side of things, and maybe even find ways to help it along. I think what I am saying is that this brings a whole new level of access for a lot of us. Any time I see something as astoundingly amazing as this addon, one of my worries (and I hope not too much of my programming ignorance is showing here) is for the future, i.e. I pray they don't close or change access to infolabels and break anything else that might drastically change the landscape. Yes, I realize my worry is maybe a little grandiose. But mind you, there is also another side of the coin: maybe there are things they can do to help the addon as well. I mean, you folks have not just created some random addon. You have actually opened up a whole new side to XBMC, from accessibility (blind, dyslexic, reading impaired, and the list goes on) to people who might just wish to run it headless for some reason or another, e.g. as an awesome music server. And isn't that a major ongoing goal of Team XBMC, to paraphrase, "to make XBMC and its user interface feel even more intuitive and user-friendly for its end-users" and "User-friendliness is next to godliness"? Well, how does this not help with that goal in a seriously major way? I think what I'm saying is that it would be cool if this were not just an addon; I would love to see it as a possible actual feature built into XBMC, you know, like the 3D is now, lol. I apologize if I seem maybe a touch over-passionate, lol.
Well, I'm not saying that it won't get included into XBMC ever, but I do think we are a while away from that. The addon certainly isn't ready now Smile

My thinking had been to do as much as possible with the addon, and when we've gone as far as we can, we'll know where we actually need changes in XBMC. If we can get those sorts of changes, then perhaps later we can get things that would improve the function of the addon - for example remove the need to parse XML to speak non-focusable controls (which is slow).

I don't think you need to worry about infoLabels and such going away, because they are an integral part of all the skins. Also, one of the great things about XBMC is that it is an open community project. It's not like closed source and paid software where your issues are ignored because they don't affect the top 95% of users. If some change is coming that will break something, we can work with the developers to find a solution.

Of course, on the flip side of the coin, adding speech directly into XBMC would be a major project. One of the benefits of the AppleTV was that they could decide from the start to have speech built in. This is certainly the best and easiest way to have speech in a program. XBMC (like most open source projects) was started by a tiny group who wanted to do a specific thing, and in their case that was to get a media player on the XBox. So at the start speech wasn't even possible, let alone on the minds of the developers. Also like most open source projects, XBMC has grown as developers saw a need (generally one that was relevant to them) and filled it.
My point is that this is why adding speech directly into XBMC now would be such a time consuming task. You would need at least one developer who was committed to making it happen, and currently there is no one, and we can't expect the current developers to devote a major amount of time to it, when they are already giving their time with whatever they are currently working on.
That being said, I have no doubt they will help if they can to make what we have going here work.

As XBMC was not built from the ground up with speech, I believe that an addon is definitely the right place for TTS. There is nothing wrong with that, much of what XBMC does is handled by addons. All the skins for XBMC are addons. In fact the push over the years has been to move whatever can be handled by addons into addons.
It is a testament to the forethought of the developers that this addon has gotten this far. Despite all the things I wish I could do that I can't, there is an amazing amount of power and flexibility with the addon system, and this will only improve as time goes on.

So for now you just have me and others who are willing to help. Fortunately I seem to be willing to spend ungodly amounts of time on a project that I have no personal use for Smile Of course I write programs because I enjoy programming and learning new things. It also turns out that all the users of this project are friendly and helpful people, which really makes a big difference as well.

Anyway, I believe it is fairly likely we can eventually get this addon included into XBMC, but it will take time to get to that point. I can guarantee it will take longer than you want Smile
In the shorter term, we can definitely get this into the official repository. We just need to get it fairly complete and stable.

I was making the point about XBMC and supporting addons because this addon has a large potential for support issues. If the addon fails to speak on startup for a non-tech-savvy blind user, I assume it will be as if XBMC started up with a black screen for a sighted user. Normal addons just have issues and throw an error or whatever when something is wrong, and don't affect the ability of the user to continue to use XBMC. So with this addon we have to do a lot to make sure that happens as little as possible, and even more so for inclusion into XBMC, but that is ultimately my goal.

[END OF LONG WINDED RAMBLE]

(2014-04-28, 18:10)popcornmix Wrote: [ -> ]
(2014-04-28, 17:51)ruuk Wrote: [ -> ]Just so I understand, was this a fix in the code for the Pi sound specifically or in code that is used on all platforms?

It is Pi specific code that is not yet in master branch. It is in the branch that gotham openelec betas build from,
and I think gotham-raspbmc-release raspbmc builds also come from there. It was in this commit (which has now been fixed)
https://github.com/popcornmix/xbmc/commi...6e78532172

The gotham release builds of raspbmc and openelec will have the fix.
You are probably best testing with the next milhouse (openelec) or miappa (raspbmc) build that appears, and they should be fixed.

It looks like openelec has just picked up the fix:
https://github.com/OpenELEC/OpenELEC.tv/...0040L15193

so it should be in next official gotham openelec build.

Thanks for taking the time to look into and fix this.
eckythump: As you can see, the crashing issue has been fixed by popcornmix. I went through your previous posts and it looks like everything has been addressed. Let me know if I missed something I should answer. I'll have to see what I can do about the wav getting cut off when I get a chance.
I've made a python HTTP wav server that will serve wavs using the same code as the addon backends. I managed to separate the backends code to a submodule, so that changes I make to the addon will be added to the server without the need for modification. Currently it can use Flite, eSpeak, and pico2wave on unixes and SAPI on windows (I used your perl code to figure it out) to serve wavs. It will also be able to speak the wavs on the server machine using any of the available backends for whatever use someone might have for that. I could imagine that being an easy way for a blind user to have the speech on headphones without others hearing it, for example. It's working pretty well now - there's just a few things I need to do and I'll post a download so you (or anyone) can try it out.
If you didn't already notice, I posted a link to the zip of your github repo for the perl server on the addon downloads page a few days ago.
I sent this message to the Cepstral people. We'll see what they have to say Smile

Quote:I am the developer of an open source TTS addon for the open source XBMC Media Center. Using XBMC is one of the major reasons many people buy a Raspberry Pi. Some of the blind users of XBMC have expressed that they would be willing to pay for Cepstral if they could use it on the Raspberry Pi with XBMC. I have been able to implement this for linux and windows because I can use the testing download to make sure it works. There is no testing download for the Pi and I can't really afford to pay for things that I am not going to use, but I would like to be able to make it possible for blind people to use it for XBMC on the Pi. I was wondering if there is any way you could help.

Thank you.
eckythump: I forgot to say, thanks for all the testing you've been doing!
(I forgot to respond to this)
(2014-04-28, 09:58)VIJO Wrote: [ -> ]LOL, I hope to get quite a few people using XBMC, so we will probably find out, lol. If it's anything like the Google voice hole, it could last a while. But doesn't that cause a lot of delay: sending the text to an external server, converting the file to wav, then having XBMC play the wav?
Surprisingly no. At least over my internet connection it works great.
Well, Cepstral quickly got back to me. While I had hoped to get a license for free, at least they gave me a link to the trial version. Maybe if I had called them I could have cried on the phone Smile
Quote:Rick,

Thank you for your interest in Cepstral TTS software. We offer our Callie voice for use on the Raspberry Pi. You can download the trial version below. The price of a personal use license is $29.99

http://www.cepstral.com/downloads/backro...1.4.tar.gz

Patrick Dexter
I will be getting Google Fiber here within the next year. When I do, I'll set up a Windows speech server for (well) myself, but let everyone else use it as long as bandwidth load doesn't become an issue.

(2014-04-27, 16:44)eckythump Wrote: [ -> ]You probably want to look at the sources.xml file. On my system it is found at /storage/.xbmc/userdata/sources.xml

I set my video sources up once and forgot about them, so I am not sure how one would go about doing what you want via the GUI, but the sources.xml file lets you specify multiple sources for a given name, like so:
Code:
<source>
    <name>TV Shows</name>
    <path pathversion="1">/storage/tvshows/</path>
    <path pathversion="1">smb://HAPPYSAMBA/media/video/TV/</path>
</source>

This didn't change anything for me. For what it's worth, I'm on Windows. I'm not trying to change sources. In the Quartz menu, under settings, it lets you disable and rename menus. So I disabled the Movies menu and renamed the TV Shows menu to Videos. It shows Videos on the home screen; however, TTS still reads TV Shows. Where is the TTS reading from? I'm not trying to change any actual sources, just the name that is read on the home screen.
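
For anyone who does want to check what a sources.xml like the fragment eckythump posted actually contains, here is a minimal Python sketch using only the standard library. The `<sources>`/`<video>` wrapper in the sample and the helper name `list_sources` are assumptions made for illustration; nothing here is taken from the addon, and it does not affect what TTS reads on the home screen.

```python
import xml.etree.ElementTree as ET

# Sample data: the <source> block quoted above, wrapped in an assumed
# <sources><video> root so that it forms a complete XML document.
SAMPLE = """<sources>
    <video>
        <source>
            <name>TV Shows</name>
            <path pathversion="1">/storage/tvshows/</path>
            <path pathversion="1">smb://HAPPYSAMBA/media/video/TV/</path>
        </source>
    </video>
</sources>"""

def list_sources(xml_text):
    """Return {source name: [paths]} for every <source> element found."""
    root = ET.fromstring(xml_text)
    result = {}
    # iter() walks the whole tree, so the exact nesting does not matter.
    for source in root.iter('source'):
        name = source.findtext('name', default='')
        paths = [p.text for p in source.findall('path') if p.text]
        result.setdefault(name, []).extend(paths)
    return result

for name, paths in list_sources(SAMPLE).items():
    print(name, paths)
```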

Ruuk Wrote:Well, Cepstral quickly got back to me. While I had hoped to get a license for free, at least they gave me a link to the trial version. Maybe if I had called them I could have cried on the phone Smile

Ruuk, You have a PM.
(2014-04-28, 23:31)VIJO Wrote: [ -> ]One of the things that is really neat about this addon is the sheer amount of what it is able to read. I always thought that if we got speech in XBMC it would read the basic stuff and that's it, kind of like speech in WMC. But no, this speech reads everything and is compatible with soooo many options. You really are talented folks.

Anyways, Ruuk and others, You really are very talented developers.
That's what I keep telling my wife Smile
(2014-04-28, 23:31)VIJO Wrote: [ -> ]I hope and aspire to join your ranks someday, In fact I wrote my first actual useful python script today, LOL.
Cool keep at it, and I hope you enjoy it. And don't hesitate to ask if you need help with something.
(2014-04-28, 23:31)VIJO Wrote: [ -> ]Ruuk I understand and agree with your points. For what its worth, I am attempting to get a hold of a few other developers that I know to come in and assist on the project, As long as you folks are alright with that, I don't want to step on any toes.
I'd love any help I can get Smile
This is actually my first project that has involved working with other people, so I'm learning to play nice with others Smile
(2014-04-28, 23:31)VIJO Wrote: [ -> ]Maybe even someday set up our own TTS server with various engines. For now, I'll do what you say and stick to the present, lol.
Well, keep thinking (dreaming?) about the future, but just remember there are a lot of steps between here and there Smile
I have all kinds of ideas of where this can and might go, but I don't always mention them in the forum, because I've learned that people latch on to things and then want them NOW Smile
I also am still discovering possibilities and limitations that alter the landscape as we go along.
(2014-04-28, 23:31)VIJO Wrote: [ -> ]I hope to initiate a Wiki here soon to help update the current and future developers and users alike with the ongoing info in this thread. Also, after I get everything fully up and running, I was going to whip up a YouTube video demoing the TTS in XBMC.
That sounds cool.
(2014-04-28, 23:31)VIJO Wrote: [ -> ]Also, I was wondering whether we should move to a thread in the addons section, as I noticed we are currently still on a very old thread in the general section, lol.
I've thought about that. I've been waiting until I felt it was ready for version 0.1.0, which for me means it moves out of the testing and exploration phase to where a user can reasonably expect it to work without issues most of the time. I suppose that describes now, but I think I'll start a release thread within the next couple of weeks.
(2014-04-28, 23:31)VIJO Wrote: [ -> ]I ordered a pi today, I want to play with all the speech options, Ohhhh, soo many options... lol.
Have fun Smile
(2014-04-28, 23:31)VIJO Wrote: [ -> ]Back to the relevant issue at hand: did you make changes to some of the stuff that is read? Or was it the dropping of the dependencies and bs4? I don't know, but in the last couple of releases it seems things have been reading better in the menu system. It's hard for me to point out exactly what I am talking about, and maybe it's just me getting used to it.
Yes, most relevant change was removing the bs4 dependency. While it was easy to work with for parsing XML, it had a dependency on another library that isn't in the standard library. I managed to find something I could include in the addon that makes it almost as easy, and should work for everyone. In any case, this issue was causing F2 and F3 to never work for basically anyone but me.

I had to edit my post because the number of smilies put me over the image count limit. I guess that means I use smilies too much. <-- No room for a smiley here.
(2014-04-29, 00:36)Traker1001 Wrote: [ -> ]I will be getting Google Fiber here within the next year. When I do, I'll set up a Windows speech server for (well) myself, but let everyone else use it as long as bandwidth load doesn't become an issue.
That would be cool of you.
(2014-04-29, 00:36)Traker1001 Wrote: [ -> ]
(2014-04-27, 16:44)eckythump Wrote: [ -> ]You probably want to look at the sources.xml file. On my system it is found at /storage/.xbmc/userdata/sources.xml

I set my video sources up once and forgot about them, so I am not sure how one would go about doing what you want via the GUI, but the sources.xml file lets you specify multiple sources for a given name, like so:
Code:
<source>
    <name>TV Shows</name>
    <path pathversion="1">/storage/tvshows/</path>
    <path pathversion="1">smb://HAPPYSAMBA/media/video/TV/</path>
</source>

This didn't change anything for me. For what it's worth, I'm on Windows. I'm not trying to change sources. In the Quartz menu, under settings, it lets you disable and rename menus. So I disabled the Movies menu and renamed the TV Shows menu to Videos. It shows Videos on the home screen; however, TTS still reads TV Shows. Where is the TTS reading from? I'm not trying to change any actual sources, just the name that is read on the home screen.
I'll have to look again, but the section name may be hard coded into the addon (well hardcoded localized). I may be able to fix that since I added the ability to parse XML after I added the code that does the section names on Quartz.
Added a new version to my repository: 0.0.46.

Get it or the repository from the Downloads Page.

Changes 0.0.45 - 0.0.46:
  • Add setting 'Use aoss' to Cepstral
  • Fix for Cepstral showing terminal on windows
  • Add google HTTP speech server backend
  • Add setting 'Speak On Server' to the HTTP server backend
  • Add remote speaking to sjhttsd backend

Some of these changes only make sense for the upcoming python speech server that uses the same backends as the addon.
Google will currently only work on systems with mplayer installed.
Cepstral will work on the Pi, but you have to get alsa working, and you probably have to check the 'Use aoss' option. I also can't promise it will work well.

By the way, I can confirm the latest update on OpenElec fixes the crashing problem.
popcornmix: I've been playing around with the newest update. The crashing is gone, but I've been checking out the truncating. Just to be clear, this is only an issue on the Pi.
My log is full of these messages with speech on:

Code:
CAESinkPi:AddPackets Underrun (delay:0.00 frames:2205)

They go away when it is off.
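
To get a rough idea of how often this happens, the log can be scanned for these lines. The sketch below is a hypothetical helper: the name `count_underruns` and the regular expression are modelled only on the single log line shown above, so treat it as a starting point, not a complete parser.

```python
import re

# Pattern modelled on the one log line quoted above; other variants of the
# message, if any exist, would not match.
UNDERRUN_RE = re.compile(
    r'CAESinkPi:AddPackets Underrun '
    r'\(delay:(?P<delay>[\d.]+) frames:(?P<frames>\d+)\)'
)

def count_underruns(lines):
    """Return (number of underrun messages, total frames they report)."""
    count = 0
    total_frames = 0
    for line in lines:
        match = UNDERRUN_RE.search(line)
        if match:
            count += 1
            total_frames += int(match.group('frames'))
    return count, total_frames

sample_log = [
    'CAESinkPi:AddPackets Underrun (delay:0.00 frames:2205)',
    'some unrelated log line',
    'CAESinkPi:AddPackets Underrun (delay:0.00 frames:2205)',
]
print(count_underruns(sample_log))
```

With a real log, an open file object can be passed in place of `sample_log`, since the function only iterates over lines.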

It seems to be a certain length that plays, rather than some fraction of the total length. Something like 1 to 1.5 seconds.
While I'm sure it's not directly related, it seems to happen to wavs of approximately the same length as the crashing issue.
Yikes, a lot happened while I was sleeping. I should take that as encouragement to sleep longer and more often.

(2014-04-28, 15:05)popcornmix Wrote: [ -> ]If you are referring to a crash occurring on the Pi then no. It's an xbmc issue not an OpenELEC issue, and I'm the person who needs to know (assuming I can reproduce it from your description, it will be fixed).
I'm glad you were able to catch this in the thread, and I'll remember that for any future bugs. Thanks!

(2014-04-28, 16:40)popcornmix Wrote: [ -> ]I've got a fix. Basically a calculation of "CalcDstSampleCount() = (src_samples * dst_rate + src_rate-1) / src_rate" was overflowing.
I've pushed a fix to newclock3 and gotham_rbp_backports trees, which is used by Milhouse builds, and hopefully the official gotham openelec beta.

Note: this fixes the hang. I can see the last word getting truncated, but that occurs with both the Pi sink and the ALSA sink, so I'm guessing that is a limit of skin sounds.
FernetMenta may be able to confirm this.
Thanks so much for getting the crashing fix sorted so quickly. I'll investigate the truncation some more soon. I previously had no truncation issues, though. The truncation bug showed up around the same time as the crashing bug. Wasn't in OpenELEC 3.95.5, but was in 3.95.6.
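
For anyone curious how a rounding-up division like the one popcornmix quotes can overflow, here is a small Python sketch. Python integers never overflow, so the fixed-width behaviour is emulated with a mask; the choice of unsigned 32-bit arithmetic and the sample rates are assumptions for illustration, not details taken from the actual fix.

```python
UINT32_MASK = 0xFFFFFFFF  # emulate unsigned 32-bit wraparound (an assumption)

def dst_sample_count_exact(src_samples, src_rate, dst_rate):
    """Rounded-up sample count after resampling, with unlimited precision."""
    return (src_samples * dst_rate + src_rate - 1) // src_rate

def dst_sample_count_32bit(src_samples, src_rate, dst_rate):
    """Same formula, but every intermediate result wraps at 32 bits."""
    product = (src_samples * dst_rate) & UINT32_MASK
    return ((product + src_rate - 1) & UINT32_MASK) // src_rate

# Illustrative rates only: a 22050 Hz wav resampled to 48000 Hz.
for src_samples in (1000, 50000, 100000):
    exact = dst_sample_count_exact(src_samples, 22050, 48000)
    wrapped = dst_sample_count_32bit(src_samples, 22050, 48000)
    status = 'OK' if exact == wrapped else 'OVERFLOW'
    print(src_samples, exact, wrapped, status)
```

Under these assumed numbers the product stops fitting once the wav is longer than about 89,000 samples (roughly four seconds at 22050 Hz), which would be consistent with the problem only appearing beyond a certain wav length.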

(2014-04-28, 17:51)ruuk Wrote: [ -> ]Well, as far as the addon goes, I can probably at least short term fix that by adding silence to the end of the wav.

Unless it cuts sound after a certain duration, it sounds like there must be some other bug that's causing it to drop a certain fraction off the tail of a wav. Maybe eckythump can weigh in on that as he's done the testing with progressively longer wavs.
I'll have to recheck and confirm, but from my observations, it seemed to cut off after a given amount of time into the wav, rather than a certain amount from the end, so I doubt adding silence would help, but I'll confirm and get back to you as soon as I can.
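
For reference, padding a wav with trailing silence is straightforward with Python's standard `wave` module. This is only a sketch of the workaround ruuk describes, not the addon's code; the function name `pad_wav_with_silence` and the half-second default are assumptions. As noted above, it would only help if the cut-off is measured from the end of the wav.

```python
import io
import wave

def pad_wav_with_silence(wav_bytes, seconds=0.5):
    """Return a copy of the wav data with `seconds` of silence appended."""
    with wave.open(io.BytesIO(wav_bytes), 'rb') as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    silence_frames = int(params.framerate * seconds)
    # 8-bit PCM is unsigned (midpoint 128); wider samples are signed (0).
    silence_byte = b'\x80' if params.sampwidth == 1 else b'\x00'
    silence = silence_byte * (
        silence_frames * params.nchannels * params.sampwidth
    )
    out = io.BytesIO()
    with wave.open(out, 'wb') as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the writer closes.
        dst.writeframes(frames + silence)
    return out.getvalue()

# Demo: build a 0.25 s mono 16-bit wav in memory, then pad it.
buf = io.BytesIO()
with wave.open(buf, 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b'\x01\x00' * 2000)
padded = pad_wav_with_silence(buf.getvalue())
with wave.open(io.BytesIO(padded), 'rb') as r:
    print(r.getnframes(), 'frames after padding')
```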

(2014-04-28, 19:35)ruuk Wrote: [ -> ]As XBMC was not built from the ground up with speech, I believe that an addon is definitely the right place for TTS. There is nothing wrong with that, much of what XBMC does is handled by addons. All the skins for XBMC are addons. In fact the push over the years has been to move whatever can be handled by addons into addons.
I concur with this. Shoehorning text-to-speech into xbmc at a low level would be an awful lot of work. You've come a long way already with your addon, and it'll be much easier to get smaller, general-purpose changes into xbmc to facilitate adding a feature to the addon where necessary. Python is also a hell of a lot easier to work with than C++, and that opens the arena up to a much wider audience. It's also much easier to make a change and test it immediately. This is the real strength of using addons.

(2014-04-28, 19:55)ruuk Wrote: [ -> ]eckythump: As you can see, the crashing issue has been fixed by popcornmix. I went through your previous posts and it looks like everything has been addressed. Let me know if I missed something I should answer. I'll have to see what I can do about the wav getting cut off when I get a chance.
I've made a python HTTP wav server that will serve wavs using the same code as the addon backends. I managed to separate the backends code to a submodule, so that changes I make to the addon will be added to the server without the need for modification. Currently it can use Flite, eSpeak, and pico2wave on unixes and SAPI on windows (I used your perl code to figure it out) to serve wavs. It will also be able to speak the wavs on the server machine using any of the available backends for whatever use someone might have for that. I could imagine that being an easy way for a blind user to have the speech on headphones without others hearing it, for example. It's working pretty well now - there's just a few things I need to do and I'll post a download so you (or anyone) can try it out.
If you didn't already notice, I posted a link to the zip of your github repo for the perl server on the addon downloads page a few days ago.
That's excellent. I look forward to looking at the code for your server. I made a brief attempt at doing it, but something really bizarre was happening when I tried to write out the wav data to disk, so I got annoyed and went and did something else instead. Smile I did look at it and made a basic HTTP framework for it using the BaseHTTPServer (I think) module, and I really liked the layout of that and could see how it'd make for a much more elegant and neat design than my perl thing.

I hadn't noticed you linking to httpttsd, but that's cool, too. Once you're happy with your python server, I won't be at all offended if you want to just link to that. It's also been my observation that python stuff seems much easier to package up as standalone packages that people can install/run as proper windows services, and that'd definitely be a great goal (in time) for your python server.
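
To give a feel for the layout being discussed, here is a minimal sketch of an HTTP server that returns a wav for a piece of text. It is not ruuk's server or the perl one. It uses `http.server`, the Python 3 name for the BaseHTTPServer module, and `synthesize_wav` is a hypothetical stand-in that returns silence so the sketch runs without any TTS engine. The `text` query parameter is also made up for this example.

```python
import io
import wave
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def synthesize_wav(text):
    """Hypothetical stand-in for a TTS backend: returns a silent wav whose
    length grows with the text (0.1 s per character at 8 kHz)."""
    frames = 800 * max(1, len(text))
    out = io.BytesIO()
    with wave.open(out, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b'\x00\x00' * frames)
    return out.getvalue()

class WavHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the query string is inspected; the path itself is ignored.
        query = parse_qs(urlparse(self.path).query)
        text = query.get('text', [''])[0]
        if not text:
            self.send_error(400, 'Missing text parameter')
            return
        data = synthesize_wav(text)
        self.send_response(200)
        self.send_header('Content-Type', 'audio/x-wav')
        self.send_header('Content-Length', str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def make_server(host='127.0.0.1', port=0):
    """Create (but do not start) the server; port 0 picks a free port."""
    return HTTPServer((host, port), WavHandler)
```

To try it, call `make_server(port=8080).serve_forever()` and request `/speak?text=hello` from another machine or a browser. A real backend would replace `synthesize_wav` with a call to an actual engine.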

(2014-04-28, 21:16)ruuk Wrote: [ -> ]eckythump: I forgot to say, thanks for all the testing you've been doing!
No worries. Thanks for all your efforts, too. It's exciting to have a media device connected to the TV that's actually accessible. I've also quite enjoyed the development/tinkering. I'm looking forward to taking my pi over to Vision Australia to show them.

One other thought I literally just had was that it might be good to add an option into the config section that lets you toggle between screen-reading mode (what we currently have) and a non-screen-reading mode. The latter essentially being silent all the time, but leaving your addon available for other future addons to leverage it if they want their addons to generate speech, for example if someone wanted to write an addon to speak subtitles.

Alright, I think that's everything. Need to go be productive for a while. Smile
(2014-04-29, 03:57)ruuk Wrote: [ -> ]It seems to be a certain length that plays, rather than some fraction of the total length. Something like 1 to 1.5 seconds.
While I'm sure it's not directly related, it seems to happen to wavs of approximately the same length as the crashing issue.

Yes, I've had another look, and have found the cause of the truncation. There was a 256K limit in size (after channel mapping/resampling).
I have a fix for that on newclock3 and gotham_rbp_backports branches. It should appear in a future build.
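
As a rough sanity check on why a 256K limit would show up as a cut-off after a second or so, here is the arithmetic as a Python sketch. The post does not say what the sample format is after channel mapping/resampling, so the formats below are assumptions picked for illustration.

```python
LIMIT_BYTES = 256 * 1024  # the 256K limit mentioned above

def seconds_in_buffer(rate_hz, channels, bytes_per_sample, limit=LIMIT_BYTES):
    """How many seconds of uncompressed PCM audio fit into `limit` bytes."""
    bytes_per_second = rate_hz * channels * bytes_per_sample
    return limit / bytes_per_second

# Assumed output formats, for illustration only.
formats = [
    ('44.1 kHz stereo 16-bit', 44100, 2, 2),
    ('48 kHz stereo 16-bit', 48000, 2, 2),
    ('48 kHz stereo 32-bit float', 48000, 2, 4),
]
for label, rate, channels, width in formats:
    print('%s: %.2f s' % (label, seconds_in_buffer(rate, channels, width)))
```

For the two 16-bit stereo cases this comes out at roughly 1.4 to 1.5 seconds, which is in line with the "1 to 1.5 seconds" ruuk observed, though the real format is not stated in the thread.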