RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-03
(2017-02-02, 23:21)misa Wrote: (2017-02-02, 12:11)DaLanik Wrote: v4.4
- Improve heuristic search (thanx Peter!)
Yes, I am stupid, but what does this mean: heuristic search?
It means the addon uses some logic to find ads and unwanted lines in subtitles instead of relying only on fixed definitions (similar to how antivirus software works, but not as complex).
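To make the difference concrete, here is a minimal sketch, not the addon's actual code. A definition lookup only catches strings it already knows, while a heuristic flags a line based on generic signals. The definition entry, the two patterns and both function names are assumptions chosen for illustration.

```python
import re

# Hypothetical definition list: exact strings already known to be ads.
DEFINITIONS = {"torrent downloaded by rarbg"}

# Hypothetical heuristic signals: generic shapes that ad lines tend to have.
HEURISTIC_PATTERNS = [
    re.compile(r"\bwww\.[a-z0-9-]+\.[a-z]{2,}", re.IGNORECASE),  # bare web address
    re.compile(r"\bsync(ed)?\b.*\bby\b", re.IGNORECASE),         # "Sync ... by ..."
]

def matches_definition(line):
    """True only if the whole line is a known ad string."""
    return line.strip().lower() in DEFINITIONS

def looks_like_ad(line):
    """True if any generic ad signal appears anywhere in the line."""
    return any(p.search(line) for p in HEURISTIC_PATTERNS)
```

A line such as `www.addic7ed.com` is not in the definition list, yet the heuristic still flags it.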
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-03
v4.5
- Change all xml settings reading from xml.dom to json
- Added Dutch translation (thanx Peter!)
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - johndoe64 - 2017-02-03
Hello,
I can't download the new version as the repo seems to be down. Where to download the latest version?
T.I.A.
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-03
Just checked, it is online...
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-04
v4.6
- json paths bugfix when scanning library
Sorry for the bug
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-05
v4.7
- Improve logic
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - HeresJohnny - 2017-02-10
Hi there,
reporting back after my NFS request. Unfortunately, CleanSubs has never worked for me, neither through SMB nor NFS. I'm adding the relevant part of a debug log, maybe you find something...
https://hastebin.com/xakeziqapi.tex
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-11
Is that with the latest version? I have tested with a local NFS server and it worked for me. I'll examine the log anyway.
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-12
v4.8
- Uses language(s) selected in Kodi subtitle settings
With this setting ON (default), CleanSubs will use only definitions for the languages defined in Kodi subtitle settings. Serbian/Serbian Cyrillic/Serbo-Croatian/Croatian/Bosnian are treated as one (Serbo-Croatian) language, which of course they are. Custom subscriptions from the web portal (http://cleansub.heliohost.org) no longer work.
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-13
(2017-02-10, 23:39)HeresJohnny Wrote: Hi there,
reporting back after my NFS request. Unfortunately, CleanSubs has never worked for me, neither through SMB nor NFS. I'm adding the relevant part of a debug log, maybe you find something...
https://hastebin.com/xakeziqapi.tex
OK, here's a fix that should work for you. Let me know:
v4.9
- Bug fix
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - User 325245 - 2017-02-15
v5.0
- Optimize stats collection
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - thezoggy - 2017-02-23
fresh kodi v17 install (using mysql backend / sources.xml), installed repo, installed cleansubs.. went to configure it.. seeing:
Code: 16:14:32.445 T:7796 ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
- NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
Error Type: <type 'exceptions.IndexError'>
Error Contents: list index out of range
Traceback (most recent call last):
File "C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 41, in onSettingsChanged
GetSettings()
File "C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 91, in GetSettings
Lang[i] = Lang[i].upper()
IndexError: list index out of range
-->End of Python script error report<--
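The traceback above only shows the failing statement, `Lang[i] = Lang[i].upper()`, not the loop around it, so the following is just a guess at the shape of the problem. One way to avoid running past the end of the list is to iterate over the list itself instead of indexing with a counter. The function name is hypothetical.

```python
def normalize_languages(lang):
    """Return the language names upper-cased, whatever the list length."""
    # Iterating over the list cannot run past its end, unlike indexing
    # with a counter that assumes a fixed number of entries.
    # Empty entries are dropped (an assumption about what is wanted).
    return [name.upper() for name in lang if name]
```

With an empty settings list this simply returns an empty list rather than raising `IndexError`.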
restarted kodi, it hung on exiting due to the cleansubs db being locked.. eventually exited. went back in, configured the cleansubs manual folder.. put two .srt files in the folder to test.
then went to addons > cleansubs to run it.. clicked once.. nothing happened visually in kodi.. looking at logs:
Code: 16:16:15.288 T:4864 DEBUG: CPythonInvoker(11, C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): start processing
16:16:15.300 T:4864 DEBUG: -->Python Interpreter Initialized<--
16:16:15.300 T:4864 DEBUG: CPythonInvoker(11, C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): the source file to load is "C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py"
16:16:15.300 T:4864 DEBUG: CPythonInvoker(11, C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): setting the Python path to C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs;C:\Program Files (x86)\Kodi\addons\script.module.pil\lib;C:\Users\zoggy\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib;C:\Users\zoggy\AppData\Roaming\Kodi\addons\script.module.myconnpy\lib;C:\Program Files (x86)\Kodi\system\python\DLLs;C:\Program Files (x86)\Kodi\system\python\Lib;C:\Program Files (x86)\Kodi\python27.zip;C:\Program Files (x86)\Kodi\system\python\lib\plat-win;C:\Program Files (x86)\Kodi\system\python\lib\lib-tk;C:\Program Files (x86)\Kodi;C:\Program Files (x86)\Kodi\system\python;C:\Program Files (x86)\Kodi\system\python\lib\site-packages
16:16:15.300 T:4864 DEBUG: CPythonInvoker(11, C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): entering source directory C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs
16:16:15.300 T:4864 DEBUG: CPythonInvoker(11, C:\Users\zoggy\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): instantiating addon using automatically obtained id of "service.cleansubs" dependent on version 2.1.0 of the xbmc.python api
16:16:15.768 T:4864 DEBUG: CLEANSUBS >> DEFINITIONS >> NO NEW DEFINITIONS (L:21244 == R:21244)
16:16:15.852 T:4864 DEBUG: CLEANSUBS >> DELETED AND CREATED NEW DEF DB
after a delay, finally came back with a popup asking what to clean...
Code: 16:19:00.185 T:4864 DEBUG: CLEANSUBS STANDALONE >> FILE: >>test-RARBG_track4_eng.srt<<
16:19:00.186 T:4864 DEBUG: CLEANSUBS >> SUB STATS WILL BE ADDED TO LOCAL DATABASE
16:19:00.186 T:4864 DEBUG: CLEANSUBS >> ENC >> OPENED WITH ENCODING: utf-8
16:19:00.213 T:4864 DEBUG: CLEANSUBS >> PROCESSED IN 0.03 SECONDS, REMOVED 78 LINES
so it cleaned the sub.. but it causes cosmetic artifacts and didn't clean up the rarbg advertising...
this went from:
Code: 00:10:30,230 --> 00:10:34,265
but [Groans] this --
to: (double space after but)
Code: 00:10:30,230 --> 00:10:34,265
but  this --
this is too much.
thus may want to run a replace of '  ' (two spaces) with ' ' (one space) after cleanup is done to reduce the cosmetic stuff..
then with the CC cleanup, it's not cleaning up the 'music' stuff like:
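A minimal sketch of that post-cleanup step, assuming the bracketed cue is removed with a regular expression (the function name is made up, and this is not how the addon necessarily does it):

```python
import re

def remove_bracketed(text):
    """Drop [bracketed] cues, then tidy the spacing they leave behind."""
    text = re.sub(r"\[[^\]]*\]", "", text)
    # Collapse runs of spaces/tabs into a single space and trim the ends.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()
```

Applied to `but [Groans] this --` it yields `but this --` with a single space.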
Code: 00:14:38,544 --> 00:14:41,646
¶¶
then finally, looks like you need to add cleanup string for rarbg:
Code: 00:18:02,838 --> 00:18:04,838
Torrent downloaded by RARBG
also noticed that when it did cleanup the file it caused this dupe:
Code: 1
00:00:01,434 --> 00:00:03,602
1
<i> Previously on "marvel's</i>
<i> agents of S.H.I.E.L.D."...</i>
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - HeresJohnny - 2017-02-23
Hi there, reporting back for the path clean function. I see several problems in my debug log, of which I'm posting a sample:
Code: 20:14:32.231 T:9388 DEBUG: Thread LanguageInvoker start, auto delete: false
20:14:32.232 T:9388 INFO: initializing python engine.
20:14:32.232 T:9388 DEBUG: CPythonInvoker(61, C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): start processing
20:14:32.298 T:9388 DEBUG: -->Python Interpreter Initialized<--
20:14:32.298 T:9388 DEBUG: CPythonInvoker(61, C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): the source file to load is "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py"
20:14:32.298 T:9388 DEBUG: CPythonInvoker(61, C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): setting the Python path to C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs;C:\Program Files (x86)\Kodi\addons\script.module.pil\lib;C:\Users\JoScha\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib;C:\Users\JoScha\AppData\Roaming\Kodi\addons\script.module.myconnpy\lib;C:\Program Files (x86)\Kodi\system\python\DLLs;C:\Program Files (x86)\Kodi\system\python\Lib;C:\Program Files (x86)\Kodi\python27.zip;C:\Program Files (x86)\Kodi\system\python\lib\plat-win;C:\Program Files (x86)\Kodi\system\python\lib\lib-tk;C:\Program Files (x86)\Kodi;C:\Program Files (x86)\Kodi\system\python;C:\Program Files (x86)\Kodi\system\python\lib\site-packages
20:14:32.298 T:9388 DEBUG: CPythonInvoker(61, C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): entering source directory C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs
20:14:32.298 T:9388 DEBUG: CPythonInvoker(61, C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py): instantiating addon using automatically obtained id of "service.cleansubs" dependent on version 2.1.0 of the xbmc.python api
20:14:32.634 T:9388 DEBUG: CLEANSUBS >> DEFINITIONS >> NO NEW DEFINITIONS (L:21244 == R:21244)
20:14:32.693 T:9388 DEBUG: CLEANSUBS >> DELETED AND CREATED NEW DEF DB
20:14:43.619 T:9388 DEBUG: CLEANSUBS >> READ TOTAL DEFINITIONS: 0 elements
20:14:43.619 T:9388 DEBUG: CLEANSUBS STANDALONE >> STARTED VERSION 5.0
20:14:43.620 T:9388 DEBUG: JSONRPC: Incoming request: {
"jsonrpc": "2.0",
"id": 1,
"method": "Files.GetSources",
"params": {
"media": "video"
}
}
20:14:43.620 T:9388 DEBUG: CLEANSUBS >> VIDEO PATHS >> multipath://nfs%3a%2f%2f192.168.1.185%2fd%2fTV-Movie%2f_Movie%2fAction-Adventure-Western%2f/nfs%3a%2f%2f192.168.1.185%2fd%2fTV-Movie%2f_Movie%2fAsian%2f/nfs%3a%2f%2f192.168.1.185%2fd%2fTV-Movie%2f_Movie%2fComedy-Family-Romance%2f/nfs%3a%2f%2f192.168.1.185%2fd%2fTV-Movie%2f_Movie%2fCrime-Suspense-Mystery%2f/nfs%3a%2f%2f192.168.1.185%2fd%2fTV-Movie%2f_Movie%2fDrama-War%2f/nfs%3a%2f%2f192.168.1.185%2fd%2fTV-Movie%2f_Movie%2fHorror%2f/nfs%3a%2f%2f192.168.1.185%2fd%2fTV-Movie%2f_Movie%2fSf-Fantasy%2f/nfs%3a%2f%2f192.168.1.185%2ft%2fTV-Movie2%2f_Doku%2f/nfs%3a%2f%2f192.168.1.185%2ft%2fTV-Movie2%2f_Anime%2f_Movies%2f/
20:14:43.620 T:9388 DEBUG: CLEANSUBS >> VIDEO PATHS >> nfs://192.168.1.185/t/TV-Movie2/_Anime/_Series/
20:14:43.620 T:9388 DEBUG: CLEANSUBS >> VIDEO PATHS >> nfs://192.168.1.185/t/TV-Movie2/_tv/
20:14:43.620 T:9388 DEBUG: CLEANSUBS >> VIDEO PATHS >> nfs://192.168.1.185/q/Music/_dvd-V/
20:14:43.620 T:9388 DEBUG: CLEANSUBS >> VIDEO PATHS >> nfs://192.168.1.185/d/TV-Movie/_Movie/Animation/
20:14:48.486 T:9388 DEBUG: DialogProgress::Open called
20:14:48.486 T:9388 DEBUG: ------ Window Init (DialogConfirm.xml) ------
20:16:11.077 T:9388 DEBUG: CLEANSUBS STANDALONE >> BEGIN PATH: >>\\POSTMAN\TV-Movie2\_tv\<< FOLDERS IN PATH: >>362<<
...
20:16:51.647 T:9388 DEBUG: CLEANSUBS STANDALONE >> FILE: >>American.Horror.Story.S02E12.en.srt<<
20:16:51.677 T:9388 DEBUG: CLEANSUBS >> SQL ERROR IN CheckDatabase
20:16:51.677 T:9388 DEBUG: CLEANSUBS >> SUB STATS WILL BE ADDED TO LOCAL DATABASE
20:16:51.692 T:9388 DEBUG: CLEANSUBS >> ENC >> OPENED WITH ENCODING: utf-8
20:16:51.713 T:9388 DEBUG: CLEANSUBS >> SQL ERROR IN AddtoDatabase : no such table: stats
20:16:51.736 T:9388 DEBUG: Previous line repeats 1 times.
20:16:51.736 T:9388 DEBUG: CLEANSUBS >> PROCESSED IN 0.09 SECONDS, NO LINES REMOVED
20:16:51.736 T:9388 DEBUG: CLEANSUBS STANDALONE >> FILE: >>American.Horror.Story.S02E12.ja.srt<<
20:16:51.766 T:9388 DEBUG: CLEANSUBS >> SQL ERROR IN CheckDatabase
20:16:51.767 T:9388 DEBUG: CLEANSUBS >> SUB STATS WILL BE ADDED TO LOCAL DATABASE
20:16:51.777 T:9388 DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING utf-8
20:16:51.782 T:9388 DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1250
20:16:51.792 T:9388 DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1251
20:16:51.797 T:9388 DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1252
20:16:51.807 T:9388 DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1253
20:16:51.817 T:9388 DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1254
20:16:51.827 T:9388 DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1257
20:16:51.827 T:9388 DEBUG: CLEANSUBS >> ENC >> OPENED WITH KODI ENCODING:
20:16:51.848 T:9388 ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
- NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
Error Type: <type 'exceptions.LookupError'>
Error Contents: unknown encoding:
Traceback (most recent call last):
File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 290, in <module>
intCancel = scanPaths(manFolder, 1, 1, 3)
File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 176, in scanPaths
process_subs(os.path.join(path, basePath, name), 1)
File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 300, in process_subs
file_input = codecs.open(fileName, 'r', SubCharset, errors='ignore')
File "C:\Program Files (x86)\Kodi\system\python\Lib\codecs.py", line 899, in open
info = lookup(encoding)
LookupError: unknown encoding:
-->End of Python script error report<--
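Regarding the `no such table: stats` message in the log above: assuming the local stats database is SQLite (the thread does not say), one defensive option is to create the table if it is missing before every insert. The column names below are hypothetical, since the real schema is not shown.

```python
import sqlite3

def ensure_stats_table(conn):
    """Create the stats table if it does not exist yet."""
    # Hypothetical schema; the addon's real columns are not shown in the thread.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stats ("
        "file_name TEXT, removed_lines INTEGER)"
    )
    conn.commit()

def add_stats(conn, file_name, removed_lines):
    """Record one processed file, creating the table first if needed."""
    ensure_stats_table(conn)
    conn.execute(
        "INSERT INTO stats (file_name, removed_lines) VALUES (?, ?)",
        (file_name, removed_lines),
    )
    conn.commit()
```

Usage would look like `add_stats(sqlite3.connect(":memory:"), "example.srt", 0)`.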
1. There seems to be a problem with a non-existent table "stats".
2. There seems to be a problem with double-byte encoded subs, in my case code page 932 ANSI/OEM Japanese (Shift JIS). That's the part where CleanSubs tries a few different code pages and finally errors out. Maybe this could be made more resilient by skipping such files.
3. The cleaning itself seems to fail partially. Example:
American.Horror.Story.S02E12.en.srt vs. American.Horror.Story.S02E12.en.srt_ORIGINAL
Cleansubs manages to clean the last lines of the sub which are
Code: 728
00:42:42,598 --> 00:42:52,817
<font color="#ec14bd">Sync & corrections by honeybunny</font>
<font color="#ec14bd">www.addic7ed.com</font>
However, it fails to clean stuff from the top which still has
Code: 1
00:00:48,917 --> 00:00:51,152
Daddy?
2
00:00:51,220 --> 00:00:53,788
Daddy'll be there in a minute.
3
00:01:48,608 --> 00:01:58,632
<font color="#ec14bd">Sync & corrections by honeybunny</font>
<font color="#ec14bd">www.addic7ed.com</font>
I'll be back with some more tests about NFS.
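The "skip instead of erroring out" idea from point 2 could look roughly like this. It is a sketch under assumptions: the candidate list (including `cp932`) and the function name are mine, and the empty fallback stands in for the empty Kodi encoding seen in the log.

```python
import codecs

# Assumed candidate list; cp932 (Shift JIS) is an addition for illustration.
DEFAULT_CANDIDATES = ("utf-8", "cp1250", "cp1251", "cp1252", "cp932")

def pick_encoding(raw, candidates=DEFAULT_CANDIDATES, fallback=""):
    """Return the first encoding that decodes raw, or None if none does."""
    for name in list(candidates) + [fallback]:
        if not name:
            continue  # codecs.lookup("") raises LookupError, so skip empty names
        try:
            codecs.lookup(name)   # unknown codec name -> LookupError
            raw.decode(name)      # undecodable bytes -> UnicodeDecodeError
            return name
        except (LookupError, UnicodeDecodeError):
            continue
    return None
```

A caller would read the file as bytes, call `pick_encoding`, and skip the file when the result is `None`. Note that single-byte code pages accept most byte sequences, so the order of candidates matters.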
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - HeresJohnny - 2017-02-23
(2017-02-23, 00:18)thezoggy Wrote: then with the CC cleanup, its not cleaning up the 'music' stuff like:
Code: 00:14:38,544 --> 00:14:41,646
¶¶
Please leave those in there as it gives a hint which parts are lyrics and which part is spoken.
RE: CleanSubs - (Clean subtitles from the ads and other rubbish) - thezoggy - 2017-02-23
(2017-02-23, 21:56)HeresJohnny Wrote: (2017-02-23, 00:18)thezoggy Wrote: then with the CC cleanup, its not cleaning up the 'music' stuff like:
Code: 00:14:38,544 --> 00:14:41,646
¶¶
Please leave those in there as it gives a hint which parts are lyrics and which part is spoken.
(2017-02-23, 00:18)thezoggy Wrote: then finally, looks like you need to add cleanup string for rarbg:
Code: 00:18:02,838 --> 00:18:04,838
Torrent downloaded by RARBG
Pretty sure you shouldn't post stuff like that here.
It's a valid string to clean up. I'm not posting release names (like you are) or how to get them.. which is the part you shouldn't be doing.
And if you don't want it cleaning up CC-related entries.. don't use that option.
For better context on the ¶¶ entries, you can see that they are just fillers for a montage.. no lyrics or anything being used... if you're cleaning up sounds like "[ door creaks ]", you might as well clean up the visual sound cue too..
Code: 00:02:45,633 --> 00:02:47,633
Where?
Nome, Alaska.
69
00:02:47,635 --> 00:02:51,403
¶¶
70
00:02:53,039 --> 00:02:54,873
[ door creaks ]
71
00:02:54,875 --> 00:02:57,709
[ Wind howling ]
72
00:02:57,711 --> 00:03:01,246
[ Door closes ]
73
00:03:09,556 --> 00:03:11,456
Mack: Coulson.
74
00:03:19,799 --> 00:03:22,467
What the hell is this?
75
00:03:29,542 --> 00:03:31,843
It's...You.
76
00:03:31,845 --> 00:03:33,979
¶¶
77
00:04:12,619 --> 00:04:15,487
¶¶
78
00:04:30,703 --> 00:04:31,803
doctor.
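Since there is disagreement above about whether marker-only cues should go, a sketch that treats them like bracketed sound cues but keeps that behaviour optional might look like this. The names and the set of marker characters are assumptions.

```python
import re

# A cue whose whole text is one [bracketed sound description].
BRACKET_CUE = re.compile(r"^\s*\[[^\]]*\]\s*$")
# A cue whose whole text is only music/pilcrow markers (assumed marker set).
MUSIC_ONLY = re.compile(r"^\s*(?:[♪♫¶]\s*)+$")

def is_sound_only(text, include_music=True):
    """True if the cue carries no dialogue, only a sound or music marker."""
    if BRACKET_CUE.match(text):
        return True
    return include_music and bool(MUSIC_ONLY.match(text))
```

Passing `include_music=False` leaves the ¶¶ cues in place for those who want them as a hint.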