Kodi Community Forum

Full Version: CleanSubs - (Clean subtitles from the ads and other rubbish)
(2017-02-23, 23:04)thezoggy Wrote: [ -> ]I'm not posting release names (like you are)

Touché. This is his mother. My son's been grounded for a week.
Will have a look once I have time; right now I'm working on the KCleaner addon :) To submit strings, use the website: http://cleansub.heliohost.org/
v5.1
- Bug fix

Sorry for the bug. It seems no languages except "generic" were used when the addon was set to use "kodi subtitle language settings". My bad; it should be fixed now.
(2017-02-23, 00:18)thezoggy Wrote: [ -> ]Lang[i] = Lang[i].upper()
IndexError: list index out of range
-->End of Python script error report<--

Don't know why it did this. Strange. Hopefully it won't happen again.


(2017-02-23, 00:18)thezoggy Wrote: [ -> ]this went from:
Code:
00:10:30,230 --> 00:10:34,265
but [Groans] this --

to: (double space after but)
Code:
00:10:30,230 --> 00:10:34,265
but  this --
this is too much.

OK, will fix this in the next release.
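
For illustration, the double-space problem reported above could be handled roughly as follows. This is a minimal sketch, not the addon's actual code: the function name `strip_hi_cues` and both regular expressions are assumptions, and only square-bracket cues such as the `[Groans]` example are covered.

```python
import re

# Hypothetical patterns: a bracketed hearing-impaired cue, and runs of spaces/tabs.
BRACKETED_CUE = re.compile(r"\[[^\]]*\]")
MULTI_SPACE = re.compile(r"[ \t]{2,}")


def strip_hi_cues(line):
    """Remove [bracketed] cues, then collapse the leftover whitespace."""
    cleaned = BRACKETED_CUE.sub("", line)
    # Removing a cue from the middle of a line leaves two spaces behind,
    # so collapse any run of spaces into a single one.
    cleaned = MULTI_SPACE.sub(" ", cleaned)
    return cleaned.strip()
```

Applied to the line quoted above, `strip_hi_cues("but [Groans] this --")` leaves a single space between "but" and "this".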

(2017-02-23, 00:18)thezoggy Wrote: [ -> ]then with the CC cleanup, it's not cleaning up the 'music' stuff like:

Code:
00:14:38,544 --> 00:14:41,646
¶¶

Don't know if I should fix this... it is not advertising, etc.

(2017-02-23, 00:18)thezoggy Wrote: [ -> ]then finally, looks like you need to add cleanup string for rarbg:

Code:
00:18:02,838 --> 00:18:04,838
Torrent downloaded by RARBG

Not a problem to add to definitions. Will be in next release.
(2017-02-23, 21:49)HeresJohnny Wrote: [ -> ]3. The cleaning itself seems to fail partially. Example:
American.Horror.Story.S02E12.en.srt vs. American.Horror.Story.S02E12.en.srt_ORIGINAL

Cleansubs manages to clean the last lines of the sub which are
Code:
728
00:42:42,598 --> 00:42:52,817
<font color="#ec14bd">Sync & corrections by honeybunny</font>
<font color="#ec14bd">www.addic7ed.com</font>

However, it fails to clean stuff from the top which still has
Code:
1
00:00:48,917 --> 00:00:51,152
Daddy?

2
00:00:51,220 --> 00:00:53,788
Daddy'll be there in a minute.

3
00:01:48,608 --> 00:01:58,632
<font color="#ec14bd">Sync & corrections by honeybunny</font>
<font color="#ec14bd">www.addic7ed.com</font>

I'll be back with some more tests about NFS.

The only way I can explain this is if you have only heuristics turned on and definitions turned off. In that case, by default, the heuristics will clean only the first 2 lines and the last 2 lines.
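
To make that explanation concrete, here is a small sketch of a head/tail window. It assumes "lines" means subtitle entries and that the window size is 2 at each end; the function name `heuristic_window` and its parameters are hypothetical and are not taken from the addon.

```python
def heuristic_window(entry_count, edge=2):
    """Return the 0-based indices a head/tail-only heuristic would inspect."""
    head = range(0, min(edge, entry_count))
    tail = range(max(entry_count - edge, 0), entry_count)
    # Use a set so short files do not list the same entry twice.
    return sorted(set(head) | set(tail))
```

For the 728-entry file in the report, this window covers entries 1, 2, 727 and 728 (indices 0, 1, 726, 727). The credit block in entry 3 falls outside it, which would match the behaviour described.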
v5.2
- Clean double spaces after cleaning hearing impaired
- Added more codepages (Hebrew, Arabic, Japanese)
The addon fails on K18 "Leia" with the following message (not from debug log):

Code:
22:59:16.098 T:7864   ERROR: Previous line repeats 20 times.
22:59:16.098 T:7864   ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
                                             - NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
                                            Error Type: <class 'sqlite3.ProgrammingError'>
                                            Error Contents: Cannot operate on a closed database.
                                            Traceback (most recent call last):
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 1294, in <module>
                                                if getDefinitions() != 0:
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 661, in getDefinitions
                                                c.execute("SELECT definition FROM definitions")
                                            ProgrammingError: Cannot operate on a closed database.
                                            -->End of Python script error report<--
Damn, and I just installed Krypton :)

Will have a look....
v5.3
- Bug fix

The bug was not only in Leia... I was concentrating on making the "auto language selection" work and messed up the other part :) Should be fixed anyway.
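
For readers who hit the same message: `sqlite3` raises `ProgrammingError: Cannot operate on a closed database` whenever a cursor is used after its connection has been closed. The sketch below only illustrates that general behaviour; the actual cause inside the addon is not shown in this thread. The table and column names come from the query in the traceback, while the function names are assumptions.

```python
import sqlite3


def read_definitions(conn):
    """Fetch all definition strings while the connection is still open."""
    cur = conn.cursor()
    cur.execute("SELECT definition FROM definitions")
    return [row[0] for row in cur.fetchall()]


def closed_connection_error():
    """Reproduce the error by using a cursor after closing its connection."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE definitions (definition TEXT)")
    conn.close()
    try:
        cur.execute("SELECT definition FROM definitions")
    except sqlite3.ProgrammingError as exc:
        return exc  # the same exception class as in the log above
    return None
```

One way to avoid this class of error is to finish every query before calling `close()`, or to open and close the connection in the same function that runs the query.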
Thumbs up, Dalai Lanik :-)
CleanSubs just had a great run, cleaning a third of my subs until it errored out. I think you're very close to perfection. Here's what I can glean from the normal log; I will run some debug tests later to identify the subs which caused this.

Code:
10:05:56.839 T:1652   ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
                                             - NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
                                            Error Type: <type 'exceptions.IndexError'>
                                            Error Contents: string index out of range
                                            Traceback (most recent call last):
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 290, in <module>
                                                intCancel = scanPaths(manFolder, 1, 1, 3)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 176, in scanPaths
                                                process_subs(os.path.join(path, basePath, name), 1)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 414, in process_subs
                                                if not line.text[subsPos - 1].isalnum() or not line.text[subsPos + 1].isalnum():
                                            IndexError: string index out of range
                                            -->End of Python script error report<--
Here goes. This run, the script got much further, which leads me to believe that it might be a memory problem. There are no real indicators, though, except for that stats hint.

Code:
10:24:50.102 T:1596   DEBUG: CLEANSUBS STANDALONE >> FILE: >>Hannibal.S02E02.HDTV.x264-LOL.ja.srt<<
10:24:50.169 T:1596   DEBUG: CLEANSUBS >> SQL ERROR IN CheckDatabase
10:24:50.170 T:1596   DEBUG: CLEANSUBS >> SUB STATS WILL BE ADDED TO LOCAL DATABASE
10:24:50.178 T:1596   DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING utf-8
10:24:50.182 T:1596   DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1250
10:24:50.187 T:1596   DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1251
10:24:50.191 T:1596   DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1252
10:24:50.194 T:1596   DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1253
10:24:50.199 T:1596   DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1254
10:24:50.203 T:1596   DEBUG: CLEANSUBS >> ENC >> TRYING ENCODING cp1255
10:24:50.209 T:1596   DEBUG: CLEANSUBS >> ENC >> OPENED WITH ENCODING: cp1256
10:24:50.242 T:1596   DEBUG: CLEANSUBS >> SQL ERROR IN AddtoDatabase : no such table: stats
10:24:50.328 T:1596   DEBUG: Previous line repeats 2 times.
10:24:50.328 T:1596   ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
                                             - NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
                                            Error Type: <type 'exceptions.IndexError'>
                                            Error Contents: string index out of range
                                            Traceback (most recent call last):
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 290, in <module>
                                                intCancel = scanPaths(manFolder, 1, 1, 3)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 176, in scanPaths
                                                process_subs(os.path.join(path, basePath, name), 1)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 414, in process_subs
                                                if not line.text[subsPos - 1].isalnum() or not line.text[subsPos + 1].isalnum():
                                            IndexError: string index out of range
                                            -->End of Python script error report<--
(2017-03-14, 11:11)HeresJohnny Wrote: [ -> ]Thumbs up, Dalai Lanik :-)
CleanSubs just had a great run, cleaning a third of my subs until it errored out. I think you're very close to perfection. Here's what I can glean from the normal log; I will run some debug tests later to identify the subs which caused this.

Code:
10:05:56.839 T:1652   ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
                                             - NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
                                            Error Type: <type 'exceptions.IndexError'>
                                            Error Contents: string index out of range
                                            Traceback (most recent call last):
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 290, in <module>
                                                intCancel = scanPaths(manFolder, 1, 1, 3)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 176, in scanPaths
                                                process_subs(os.path.join(path, basePath, name), 1)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 414, in process_subs
                                                if not line.text[subsPos - 1].isalnum() or not line.text[subsPos + 1].isalnum():
                                            IndexError: string index out of range
                                            -->End of Python script error report<--

OK, I understand this error, shouldn't be a problem to fix... there simply isn't a character at position - 1 or + 1... back to the drawing board :)
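
A bounds-safe form of the check from the traceback might look like the sketch below. It is only an illustration under stated assumptions: the helper names are hypothetical, and treating a missing neighbour as "not alphanumeric" is a guess at the intended behaviour, not the addon's confirmed logic.

```python
def is_alnum_at(text, index):
    """True only if index lies inside text and that character is alphanumeric."""
    # Negative indices are treated as "no character" instead of wrapping
    # around to the end of the string, which plain indexing would do.
    return 0 <= index < len(text) and text[index].isalnum()


def has_non_alnum_neighbour(text, pos):
    """Mirror the check from the traceback without raising IndexError."""
    return not is_alnum_at(text, pos - 1) or not is_alnum_at(text, pos + 1)
```

With this form, a match at the very start or end of a line (or in an empty line) counts as having a non-alphanumeric neighbour rather than crashing the scan.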
v5.4
- Bug fix
- Slightly optimized logic of reading definitions
- Progress bar during reading of definitions
So close... with the latest version, CleanSubs parsed half of my folders before it encountered an error:

Code:
23:39:19.204 T:9012   ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
                                             - NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
                                            Error Type: <type 'exceptions.IndexError'>
                                            Error Contents: string index out of range
                                            Traceback (most recent call last):
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 289, in <module>
                                                intCancel = scanPaths(manFolder, 1, 1, 3)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\standalone.py", line 176, in scanPaths
                                                process_subs(os.path.join(path, basePath, name), 1)
                                              File "C:\Users\JoScha\AppData\Roaming\Kodi\addons\service.cleansubs\default.py", line 417, in process_subs
                                                if not line.text[subsPos - 1].isalnum() or not line.text[subsPos + 1].isalnum():
                                            IndexError: string index out of range
                                            -->End of Python script error report<--
Will have a deeper look at a debug log.

UPDATE: The problem here is probably a combination of things. According to the debug log, CleanSubs opened the problematic file with CP1256 (Arabic), whereas it really is CP932 (Japanese Shift-JIS). Of course, there would be only garbage to analyze.
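
The mis-detection is what one would expect from trying encodings in a fixed order and keeping the first that decodes without error: single-byte codepages such as CP1256 accept almost any byte sequence, so Shift-JIS data can "succeed" there. The sketch below illustrates this and one possible mitigation, using the language code in the file name to try a matching codepage earlier. The function names and the `LANGUAGE_HINTS` table are assumptions for illustration, not part of the addon.

```python
def first_decodable(data, encodings):
    """Return the first encoding name that decodes data without error."""
    for name in encodings:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return None


# Hypothetical mapping from a file-name language code to likely codepages.
LANGUAGE_HINTS = {"ja": ["cp932"], "ar": ["cp1256"], "he": ["cp1255"]}


def candidates_for(filename, defaults):
    """Order encodings as: utf-8, then language-hinted ones, then the rest."""
    parts = filename.lower().split(".")
    # Expect names like "Show.S01E01.ja.srt"; the code sits before the extension.
    lang = parts[-2] if len(parts) >= 3 else ""
    hinted = LANGUAGE_HINTS.get(lang, [])
    rest = [e for e in defaults if e != "utf-8" and e not in hinted]
    return ["utf-8"] + hinted + rest
```

For a file named like the one in the debug log (`...ja.srt`), `candidates_for` would try `cp932` right after `utf-8`, before any of the Windows single-byte codepages.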
Every time I start up CleanSubs, it displays a percentage count-up at the top of the screen that runs up to 899. What is that? (The label I see at the top then reads "cleansubs".)