Voice recognition and control?, just basic!
#61
Sinnocence Wrote:I'm having the same issue, Win 7 64-bit. Mic is selected as the default recording device, I can see the levels in control panel and all looks well. Tried a combination of various settings incl mic level/etc to no effect. On the main UI under 'Microphone' (in voxcommando) the label reads 'audio warnings' though my log is identical to the one posted above (i.e. no discernible audio warnings?).

If you find a fix, could you post here? Thanks.

Got it working with some help from jitterjames. First open the Windows Control Panel and select Speech Recognition, then in "Advanced speech options" on the left, change the language drop-down box to the desired one and it should work. I had to change mine to English US even though I'm in the UK and the rest of my PC is set to UK. This got it working for me.
#62
That fixed it, thanks. I'm not sure what you mean by "the display language", but all i18n settings are English UK/Irish on these PCs (in fact, I tend to remove/uninstall English US to avoid hiccups in other software). I'm not really sure what would cause that mix-up!

Thanks again.
#63
It's a Microsoft thing. It can't seem to decide whether the two languages are the same or not. For two totally different languages it is not an issue.
#64
Lettered jump-lists are now working using sendkeys.
The spoken commands are a bit tricky (as I said, some letters sound the same).
I have found that a phonetic spelling of a letter's name is recognized more accurately than the bare letter ("Bee" or "Bea" instead of "B" makes it less likely to be confused with "C", "D", etc.). Not sure why, but going by the accuracy meter in VC, recognition comes back 10-12% better. So I just need to fine-tune each letter. As I figured I'd be doing a lot of editing, I didn't do it as a payload, but as separate entries. It was easier to just write a macro to generate the XML than to use the interface. Is there any advantage to using payloads in terms of memory or processor overhead? If so, I'll clean it up when I get the letters working more accurately. Otherwise, they're functional as is.

Another way of improving the accuracy is to keep the mic a good distance from your mouth. F and S, for example, are easily distinguished if the mic is below your chin, but with it directly in front of your mouth, it's a coin toss as to which it will hear.

Code:
<commandGroup name="SENDKEYS">
        <command name="+A">
            <phrase>Go To A</phrase>
        </command>
        <command name="+B">
            <phrase>Go To Bee, Go To Bea, Go to B</phrase>
        </command>
        <command name="+C">
            <phrase>Go To C, Go to See</phrase>
        </command>
        <command name="+D">
            <phrase>Go To D, Go To Dee</phrase>
        </command>
        <command name="+E">
            <phrase>Go to E</phrase>
        </command>
        <command name="+F">
            <phrase>Go to F, go to eff</phrase>
        </command>
        <command name="+G">
            <phrase>Go to G, go to Gee</phrase>
        </command>
        <command name="+H">
            <phrase>Go to H</phrase>
        </command>
        <command name="+I">
            <phrase>Go to I</phrase>
        </command>
        <command name="+J">
            <phrase>Go to J, go to Jay</phrase>
        </command>
        <command name="+K">
            <phrase>Go to K, Go to kay</phrase>
        </command>
        <command name="+L">
            <phrase>go to L, go to ELL</phrase>
        </command>
        <command name="+M">
            <phrase>go to M, go to emm</phrase>
        </command>
        <command name="+N">
            <phrase>go to N, go to en</phrase>
        </command>
        <command name="+O">
            <phrase>go to O, go to oh</phrase>
        </command>
        <command name="+P">
            <phrase>go to P, go to Pea</phrase>
        </command>
        <command name="+Q">
            <phrase>Go to Q, go to que</phrase>
        </command>
        <command name="+R">
            <phrase>go to R, go to are</phrase>
        </command>
        <command name="+S">
            <phrase>go to S, go to ess</phrase>
        </command>
        <command name="+T">
            <phrase>Go to T, go to tee</phrase>
        </command>
        <command name="+U">
            <phrase>go to U, go to you</phrase>
        </command>
        <command name="+V">
            <phrase>go to V, go to vee</phrase>
        </command>
        <command name="+W">
            <phrase>go to W, go to double You</phrase>
        </command>
        <command name="+X">
            <phrase>Go to X, go to ex</phrase>
        </command>
        <command name="+Y">
            <phrase>Go to Y, go to why</phrase>
        </command>
        <command name="+Z">
            <phrase>Go to Z, go to zee</phrase>
        </command>
    </commandGroup>

As usual, just copy and paste this in your voicecommands.xml file. I'd be interested to know if people are comfortable using this, and what their accuracy is with different microphone setups.
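
For anyone who would rather regenerate the group than hand-edit 26 entries, here is a rough sketch of the kind of macro mentioned above. It is not the script that was actually used. The names `ALIASES` and `build_sendkeys_group` are made up for illustration, and the aliases are simply copied from the phrases in the listing.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Spoken aliases per letter, copied from the phrases posted above.
# Letters with no entry get only the bare letter.
ALIASES = {
    "B": ["Bee", "Bea"], "C": ["See"], "D": ["Dee"], "F": ["eff"],
    "G": ["Gee"], "J": ["Jay"], "K": ["kay"], "L": ["ELL"],
    "M": ["emm"], "N": ["en"], "O": ["oh"], "P": ["Pea"],
    "Q": ["que"], "R": ["are"], "S": ["ess"], "T": ["tee"],
    "U": ["you"], "V": ["vee"], "W": ["double You"], "X": ["ex"],
    "Y": ["why"], "Z": ["zee"],
}


def build_sendkeys_group(aliases, prefix="Go to", group_name="SENDKEYS"):
    """Return a <commandGroup> string with one +LETTER command per letter."""
    lines = ["<commandGroup name=%s>" % quoteattr(group_name)]
    for code in range(ord("A"), ord("Z") + 1):
        letter = chr(code)
        spoken = [letter] + aliases.get(letter, [])
        phrase = ", ".join("%s %s" % (prefix, word) for word in spoken)
        lines.append("    <command name=%s>" % quoteattr("+" + letter))
        lines.append("        <phrase>%s</phrase>" % escape(phrase))
        lines.append("    </command>")
    lines.append("</commandGroup>")
    return "\n".join(lines)


xml_text = build_sendkeys_group(ALIASES)
ET.fromstring(xml_text)  # raises ParseError if the output is not well-formed
print(xml_text)
```

Changing `prefix` (for example to "Jump to") regenerates every phrase in one go, which is the main reason to script it while the wording is still being tuned.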
#65
Yes, there are advantages, especially if you want to reuse them, or if you want to use different command phrases before the actual letter.

If you want to be able to say "jump to a, letter a, goto a" etc., it's less work, and less memory (I think), to use payloads. It may also reduce processing because of the branching logic, but this is only conjecture. I don't know enough about the inner workings of the API.

Also, if you use payload xml files, you can easily reuse the external file in other commands if you want, then you only need to edit in one place. Let's say for example that you also wanted to create a spell it command, so you could say type letter A, type letter B etc. You could reuse your xml with all the letters spelled out phonetically. The other advantage of the external xml payload files is that you can attach multiple aliases to each value, so for A you could assign "A, eh, Alpha, Aardvark" as phrases.

This will make a lot more sense in terms of ease of use when I (eventually) create an xml editor with Excel-style columns that you can just paste shit into.
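
To make the alias idea concrete, here is a small sketch that writes a letters payload in the same `PayloadsRoot` / `payload` / `value` / `phrase` layout as the XBMCwindows.xml file posted in this thread. Treat it as an illustration only: `LETTER_ALIASES` and `build_payload_xml` are hypothetical names, only the "A" aliases come from the post above, and how a command consumes each `<value>` is an assumption that has not been checked against VC.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# One entry per payload value; every alias becomes a spoken phrase for it.
# "A" uses the aliases suggested above; "B" is filled in for illustration.
LETTER_ALIASES = {
    "A": ["A", "eh", "Alpha", "Aardvark"],
    "B": ["B", "Bee", "Bea"],
}


def build_payload_xml(aliases):
    """Return a <PayloadsRoot> string with one <payload> per value."""
    lines = ["<PayloadsRoot>"]
    for value, phrases in aliases.items():
        lines.append("    <payload>")
        lines.append("        <value>%s</value>" % escape(value))
        lines.append("        <phrase>%s</phrase>" % escape(", ".join(phrases)))
        lines.append("    </payload>")
    lines.append("</PayloadsRoot>")
    return "\n".join(lines)


payload_xml = build_payload_xml(LETTER_ALIASES)
ET.fromstring(payload_xml)  # raises ParseError if the output is not well-formed
print(payload_xml)
```

Because the aliases live in one table, a "go to" command and a "type letter" command could both point at the same generated file, and a new alias only has to be added once.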
#66
arkryal Wrote:lettered jump-lists are now working using sendkeys.
The spoken commands are a bit tricky (as I said, some letters sound the same)...

Thanks for posting this, but it won't work for me and I'm not sure why. XBMC responds correctly to pressing Shift+letter on the actual keyboard, but when I use voice commands to do it, it doesn't work. I've tried all sorts, but it just never responds. In addition, I tried testing it by opening Notepad to see whether these commands would produce keystrokes there, i.e. I expected that if I said "Jump To Mike" I'd see an M typed in Notepad. That doesn't happen.

Here's my entire voicecommands.xml

Code:
<?xml version="1.0" encoding="utf-8"?>
<!--voicecommands.xml-->
<!--This should be in the same folder as VoxCommando.exe.-->
<VoiceCommands>
    <commandGroup name="XBMC Testing Functions">
        <command name="Action(11)">
            <phrase>info,more</phrase>
        </command>
        <command name="Action(10)">
            <phrase>return</phrase>
        </command>
        <command name="Action(27)">
            <phrase>Codec</phrase>
        </command>
        <command name="Action(25)">
            <phrase>Subtitles,sub,subs,caption,show text</phrase>
        </command>
        <command name="Action(12)">
            <phrase>pause,unpause,wait,resume</phrase>
        </command>
        <command name="Action(13)">
            <phrase>stop</phrase>
        </command>
        <command name="Mute">
            <phrase>mute,silence,shut up</phrase>
        </command>
        <command name="Exit">
            <phrase>close x b m c, shutdown</phrase>
        </command>
        <command name="Action(79)">
            <phrase>play</phrase>
        </command>
        <command name="Action(20)">
            <phrase>forward</phrase>
        </command>
        <command name="Action(22)">
            <phrase>skip forward</phrase>
        </command>
        <command name="Action(21)">
            <phrase>back</phrase>
        </command>
        <command name="Action(23)">
            <phrase>skip Back</phrase>
        </command>
        <command name="execbuiltin(Action(fastforward))">
            <phrase>Fast Forward</phrase>
        </command>
        <command name="execbuiltin(Action(rewind))">
            <phrase>Rewind</phrase>
        </command>
        <command name="execbuiltin(Action(increaserating))">
            <phrase>I like it, I love it</phrase>
        </command>
        <command name="execbuiltin(Action(decreaserating))">
            <phrase>I don't like it, I hate it, it sucks, it blows</phrase>
        </command>
        <command name="execbuiltin(Action(select))">
            <phrase>Select</phrase>
        </command>
        <command name="execbuiltin(EjectTray())">
            <phrase>Open Tray, Open Drive, Close Tray, Close Drive</phrase>
        </command>
    </commandGroup>
    <commandGroup name="Launch XBMC">
        <command name="D:\Program Files\XBMC\XBMC.exe">
            <phrase>x b m c</phrase>
        </command>
    </commandGroup>
    <commandGroup name="XBMC Jump To windows">
        <command name="execbuiltin(ActivateWindow">
            <phrase>Menu</phrase>
            <payloadFromXML>C:\Users\chris\Desktop\VoxCommando 073 english\payloads\XBMCwindows.xml</payloadFromXML>
        </command>
        <command name="execbuiltin(ActivateWindow(VideoLibrary,RecentlyAddedEpisodes))">
            <phrase>new tv, new shows, new episodes, latest tv, latest episodes</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(VideoLibrary,TVShowTitles))">
            <phrase>tv shows, tee vee shows, television</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(VideoLibrary,TVShowYear))">
            <phrase>tv by year</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(VideoLibrary,TVShowActors))">
            <phrase>tv by actors</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(VideoLibrary,TVShowGenres))">
            <phrase>tv by genre</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(videolibrary,movietitles))">
            <phrase>movies by title, movies, films, film</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(videolibrary,RecentlyAddedMovies ))">
            <phrase>new movies, new films, recently added films</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(videolibrary,moviegenres))">
            <phrase>movies by genre</phrase>
        </command>
        <command name="execbuiltin(ActivateWindow(videolibrary,movieyear))">
            <phrase>movies by year</phrase>
        </command>
    </commandGroup>
    <commandGroup name="XBMC Simple Navigation">
        <command name="action(1)">
            <phrase>left</phrase>
        </command>
        <command name="action(2)">
            <phrase>right</phrase>
        </command>
        <command name="action(3)">
            <phrase>up</phrase>
        </command>
        <command name="action(4)">
            <phrase>down</phrase>
        </command>
    </commandGroup>
    <commandGroup name="SENDKEYS misc">
        <command name="+A">
            <phrase>Jump To Alpha</phrase>
        </command>
        <command name="+B">
            <phrase>Jump To Beta</phrase>
        </command>
        <command name="+C">
            <phrase>Jump To Charlie</phrase>
        </command>
        <command name="+D">
            <phrase>Jump To Delta</phrase>
        </command>
        <command name="+E">
            <phrase>Jump To Echo</phrase>
        </command>
        <command name="+F">
            <phrase>Jump To Foxtrot</phrase>
        </command>
        <command name="+G">
            <phrase>Jump To Golf</phrase>
        </command>
        <command name="+H">
            <phrase>Jump To Hotel</phrase>
        </command>
        <command name="+I">
            <phrase>Jump To Indigo</phrase>
        </command>
        <command name="+J">
            <phrase>Jump To Juliet</phrase>
        </command>
        <command name="+K">
            <phrase>Jump To Kilo</phrase>
        </command>
        <command name="+L">
            <phrase>Jump To Lima</phrase>
        </command>
        <command name="+M">
            <phrase>Jump To Mike</phrase>
        </command>
        <command name="+N">
            <phrase>Jump To November</phrase>
        </command>
        <command name="+O">
            <phrase>Jump To Oscar</phrase>
        </command>
        <command name="+P">
            <phrase>Jump To Papa</phrase>
        </command>
        <command name="+Q">
            <phrase>Jump To Quebec</phrase>
        </command>
        <command name="+R">
            <phrase>Jump To Romeo</phrase>
        </command>
        <command name="+S">
            <phrase>Jump To Sierra</phrase>
        </command>
        <command name="+T">
            <phrase>Jump To Tango</phrase>
        </command>
        <command name="+U">
            <phrase>Jump To Uniform</phrase>
        </command>
        <command name="+V">
            <phrase>Jump To Victor</phrase>
        </command>
        <command name="+W">
            <phrase>Jump To Whiskey</phrase>
        </command>
        <command name="+X">
            <phrase>Jump To X-Ray</phrase>
        </command>
        <command name="+Y">
            <phrase>Jump To Yankee</phrase>
        </command>
        <command name="+Z">
            <phrase>Jump To Zulu</phrase>
        </command>
    </commandGroup>
</VoiceCommands>

and XBMCwindows.xml:

Code:
<?xml version="1.0" encoding="utf-8"?>
<!--xbmc windows-->
<!--see:    http://xbmc.svn.sourceforge.net/viewvc/xbmc/trunk/guilib/Key.h?revision=31448&content-type=text/plain-->


<PayloadsRoot>
    <payload>
        <value>home)</value>
        <phrase>home, top menu, start page</phrase>        
    </payload>
    <payload>
        <value>settings)</value>
        <phrase>settings, setup, options</phrase>        
    </payload>
    <payload>
        <value>pictures)</value>
        <phrase>photos, pictures</phrase>        
    </payload>
    <payload>
        <value>filemanager)</value>
        <phrase>file manager</phrase>        
    </payload>
    <payload>
        <value>music)</value>
        <phrase>music, tunes, songs</phrase>        
    </payload>
</PayloadsRoot>

I also found the recognition rate very low. As a compromise for now, I replaced all the letters with the NATO phonetic alphabet, which gives me way, way higher accuracy.

Hope you guys can help, this is so close to awesome! Thanks for reading.
#67
I can verify that your voicecommands.xml is OK. I replaced mine with the file you posted and it works fine.

I am surprised that it doesn't even work in Notepad. This is going to sound stupid, but you are running everything on the same machine, right? OK good.

What OS are you on?
Do other XBMC commands work OK for you?

- Try disabling UAC (if it is on) to see if that makes any difference.
- I guess you could also try running VC as administrator.

The sendkeys method that I am calling is not very robust. It doesn't work with certain versions of windows media center, and I'm sure it doesn't work with video games etc.

How about the SMS actions? Does execbuiltin(Action(jumpsms4)) take you to letter g?
#68
I'm on Win7 32-bit, and other XBMC commands work great. General navigation etc. is working; basically anything through the httpAPI is fine, but I don't seem to be able to emulate a keyboard.

I can always install EventGhost, I guess, and call that to press keys for me. If there's anything I can do to help get this working (debug builds or something?), let me know. There's nothing of note in the regular logging, btw.

I'll give jumpSMS a go tomorrow, since it's late now and I feel silly keeping the neighbours up going "jump to mike" all night, they must think I've cracked! On that note, if I could right click a command in the tree and do a test firing without having to worry about saying the command first, that'd make this whole process a lot easier to work out! just a thought, thanks either way!

edit: I have UAC on and I'm really loath to turn it off. Running as administrator makes no difference. I'll test turning UAC off tomorrow too.
#69
I think jumpsms will work for you then if the other http stuff does, and honestly I'd rather put together a solution using that anyway so that we have something that works reliably through the lan in case you want to run vox somewhere else. (e.g. on the deck with your laptop controlling your home stereo or whatever)

The real solution is to enable some kind of macro function so that you can call multiple keypresses etc.

I did have a double click to test at one point but I took it out when things got more complicated with payloads etc. I guess I should put it back in...

If you really want to put a lot of logic and context etc. into it, you're going to have to use EventGhost anyway, since I'm booked with work until at least late Sept. and won't be doing any major overhauls until then.

I'll still be doing some stuff though 'cause I can't resist!

I promise to get artist requesting by name going within the next week. I have a test working and I just have to modify the options interface etc. to let you choose phrases for requesting music by artist... After that I'll look at Albums, then Movies, then TV. I think TV will be the hardest.

edit: yes please try turning uac off temporarily, just for future reference. It probably won't work anyway.
#70
The only difference I see between your file and mine is that your command group is called 'SENDKEYS misc', rather than just 'SENDKEYS'. Maybe jitterjames can tell us, but it may be that keyboard emulation requires a strict name. I'm not at my media center now, so I can't test, but everything else looks good to me.

As for accuracy, I get variable success. I found the positioning of the mic to be vital. Properly positioned, I get about 90% right on the first try (Still low in my opinion). Of course, I ultimately intend to run this on my HTPC rather than my PC, which means I'll be using a bi-directional mic from 10-15 feet away. I agree, the accuracy won't be sufficient for that scenario, even with great equipment. The phonetic lettering scheme you used will be useful, I'll incorporate it in my commands as an alternate phrase.

Ideally, at least with Movies and TV, possibly actor names, I'd like to have an auto scan of the library and dump all the names and titles to their own commands. If a user had 2000-3000 movies and 50-100 TV shows, that would be a huge library compared to what most people have, but still manageable (not adding significantly to overhead). The issue is, how do we go about this? Would an applicable function be added to VC, would there be an external app (could even be a batch file), or a python script that auto runs with xbmc, updates commands and reloads the VC process... There are many ways to do it. I think it may be a bit early to say which method is ultimately better.

VC is a good general-purpose program, not XBMC-specific, so it may not be practical to incorporate such specific functionality directly in the program. On the other hand, forcing users to use yet another external program or script may get too confusing for the casual user. With the conversion of XBMC to JSON and the deprecation of the HTTPAPI, there may be some new method emerging to accomplish a lot of this. And as VC matures as a program, there may be better means of implementing this. Doing anything that extensive now seems like it might be futile, rendered useless in the next few weeks/months. I'd consider everything in the XML now to be a proof of concept and subject to change soon.
#71
It is only strict in that it should START with "sendkeys".
#72
Had a crack at using jumpsms, I think the repeat code is possibly tripping up?

If I do this:

Code:
    <commandGroup name="XBMC LetterJumps">
        <command name="execbuiltin(Action(jumpsms6))">
            <phrase>Letter Jump Mike</phrase>
        </command>
        <command name="[repeat:2]execbuiltin(Action(jumpsms6))">
            <phrase>Letter Jump November</phrase>
        </command>
        <command name="[repeat:4]execbuiltin(Action(jumpsms7))">
            <phrase>Letter Jump Sierra</phrase>
        </command>
    </commandGroup>

Then Mike works fine, but November and Sierra make this happen in VoxLog:

Code:
11/07/2010 06:16:25    VoxLog created:
11/07/2010 06:16:25    Starting VoxCommando, version: 0.732
11/07/2010 06:16:25    error starting directory watcher.  wrong folder?
11/07/2010 06:16:25    installed language:English (United States)
11/07/2010 06:16:25    installed language:English (United Kingdom)
11/07/2010 06:16:25    Loading Command Grammar
11/07/2010 06:16:38    ----- Speech Recognized -----
11/07/2010 06:16:38    Grammar:XBMC LetterJumps
11/07/2010 06:16:38    Grm priority:0
11/07/2010 06:16:38    command: Letter Jump Sierra  (70.7%)
11/07/2010 06:16:38    XBMC: [repeat:4]execbuiltin(Action(jumpsms7))
11/07/2010 06:16:38    XBMC: <html>
<li>Error:Unknown command
</html>

11/07/2010 06:16:38    XBMC: execbuiltin(Action(jumpsms7))
11/07/2010 06:16:38    XBMC: <html>
<li>OK
</html>

11/07/2010 06:16:38    XBMC: execbuiltin(Action(jumpsms7))
11/07/2010 06:16:38    XBMC: <html>
<li>OK
</html>

11/07/2010 06:16:38    XBMC: execbuiltin(Action(jumpsms7))
11/07/2010 06:16:38    XBMC: <html>
<li>OK
</html>

11/07/2010 06:16:38    XBMC: execbuiltin(Action(jumpsms7))
11/07/2010 06:16:38    XBMC: <html>
<li>OK
</html>

11/07/2010 06:16:38    XBMC: execbuiltin(Action(jumpsms7))
11/07/2010 06:16:38    XBMC: <html>
<li>OK
</html>

11/07/2010 06:16:38    XBMC: execbuiltin(Action(jumpsms7))


It goes on like that until I interrupt it (hundreds of times, I was curious!). Obviously I need repeat so I can do anything other than a, d, g, j, m, p, t and w. Any ideas? Is repeat actually even intended to work outside of sendkeys?
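
For reference, this is the letter-to-keypress arithmetic being attempted here, as a rough sketch. It assumes jumpsms2 to jumpsms9 follow the usual phone keypad layout, which fits what is reported in this thread (jumpsms4 lands on g, jumpsms6 on m) but has not been checked against the XBMC source. `KEYPAD`, `jumpsms_presses` and `jumpsms_commands` are names made up for the example.

```python
# Standard phone keypad layout (assumed to match XBMC's jumpsms actions).
KEYPAD = {
    2: "abc", 3: "def", 4: "ghi", 5: "jkl",
    6: "mno", 7: "pqrs", 8: "tuv", 9: "wxyz",
}


def jumpsms_presses(letter):
    """Return (key, presses): which jumpsms key to send and how many times."""
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("expected a single letter, got %r" % (letter,))
    letter = letter.lower()
    for key, letters in KEYPAD.items():
        if letter in letters:
            # First letter on a key needs one press, second needs two, etc.
            return key, letters.index(letter) + 1
    raise ValueError("no keypad key for %r" % (letter,))


def jumpsms_commands(letter):
    """Expand a letter into the list of builtin calls that would reach it."""
    key, presses = jumpsms_presses(letter)
    return ["execbuiltin(Action(jumpsms%d))" % key] * presses


for name in ("Mike", "November", "Sierra"):
    print(name, jumpsms_presses(name[0]))
```

Only the first letter on each key (a, d, g, j, m, p, t, w) needs a single press, which is why every other letter depends on some way of sending the same action a fixed number of times.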
#73
repeat should probably be called loop. It goes on repeating until you do another command, and the number is not the number of repetitions, it is the pause in milliseconds!

I don't have a repeat command. I'm trying to figure out the best way to do it that won't break what users are already doing, and that will be flexible in the future when we do macros etc. I think for now:

[repeat:500] will repeat indefinitely with a 500 ms pause and
[repeat:100:3] will repeat 3 times with a 100 ms pause

ya I think that'll be good...
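
To pin down how the two forms would be read, here is a sketch of a parser for the proposed prefix. It only illustrates the syntax described above and is not VoxCommando's code; `RepeatSpec` and `parse_repeat` are hypothetical names, and a count of `None` stands for "loop until another command".

```python
import re
from typing import NamedTuple, Optional

# [repeat:PAUSE] or [repeat:PAUSE:COUNT] at the very start of a command name.
REPEAT_TAG = re.compile(r"^\[repeat:(\d+)(?::(\d+))?\]")


class RepeatSpec(NamedTuple):
    pause_ms: Optional[int]  # pause between repetitions, None if no tag
    count: Optional[int]     # number of repetitions, None if unlimited/no tag
    command: str             # the command with the tag stripped off


def parse_repeat(command):
    """Split an optional [repeat:...] prefix off a command string."""
    match = REPEAT_TAG.match(command)
    if match is None:
        return RepeatSpec(None, None, command)
    pause_ms = int(match.group(1))
    count = int(match.group(2)) if match.group(2) is not None else None
    return RepeatSpec(pause_ms, count, command[match.end():])


print(parse_repeat("[repeat:500]execbuiltin(Action(jumpsms7))"))
print(parse_repeat("[repeat:100:3]execbuiltin(Action(jumpsms7))"))
```

Keeping the pause as the first number means an existing `[repeat:500]` command would keep its current looping behaviour, with the count as an optional extra.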
#74
jitterjames Wrote:repeat should probably be called loop. It goes on repeating until you do another command, and the number is not the number of repetitions, it is the pause in milliseconds!

I don't have a repeat command. I'm trying to figure out the best way to do it that won't break what users are already doing, and that will be flexible in the future when we do macros etc. I think for now:

[repeat:500] will repeat indefinitely with a 500 ms pause and
[repeat:100:3] will repeat 3 times with a 100 ms pause

ya I think that'll be good...

Oh I see now, yeah like you say I need to be able to specify a number of repeats for jumpSMS (and filterSMS!) to use those effectively.

Code:
[repeat:500] will repeat indefinitely with a 500 ms pause and
[repeat:100:3] will repeat 3 times with a 100 ms pause

Sounds awesome, love it.

Do you know how long it might take to implement that? I'll gladly use EventGhost to do this stuff for me if it's going to be a while, I'm just wondering whether I should get started with that or wait for limited repeat support in vox.

I'd really like to be able to implement all the major functionality without EG so I can post my xmls here and make it basically plug+play for people with less time on their hands than myself.

Thanks anyway for this awesome little app, I've had lots of fun playing with it this weekend!
#75
I'm going to do it today!

The only problem is that I have a few mods going on at once so I have to clean some stuff up before I post it, so I may not be able to manage it until tomorrow.

EG has a lot of other great features so it is well worth your time

... but I agree, I'd like as much as possible for VoxCommando to operate stand alone so that one day it can be as plug and play as possible for those with the little brains and/or tight schedules!