2014-11-22, 07:50
Some very quick regex profiling analysis (see here for details) - this is just for entering the BBC One catchup channel:
Code:
Method     Freq  regex vs. re  Avg +/- us | re (min / max / avg / total)
c.compile   368     0 vs  368   1269.3625 |   63.8962 / 74696.0640 /  1269.3625 / 0.4671s
c.findall   674     0 vs  674    101.5341 |   56.9820 /  3571.9872 /   101.5341 / 0.0684s
c.match    1012     0 vs 1012     73.1910 |   39.1006 /  4981.0410 /    73.1910 / 0.0741s
c.search    337     0 vs  337     80.8911 |   30.9944 /  3283.0238 /    80.8911 / 0.0273s
d.sub         2     0 vs    2  14349.4606 | 4667.0437 / 24031.8775 / 14349.4606 / 0.0287s
=========================================================================================
TOTAL      2393     0 vs 2393     0.6656s |                                       0.6656s
ELAPSED TIME less re    : 34.9420s
ELAPSED TIME less regex : 35.6076s
ELAPSED TIME TOTAL      : 35.6076s
PERF LOGGING OVERHEAD   :  2.8955s (included in above elapsed times)
On the face of it, regex parsing doesn't seem to be the problem: in total it accounts for only 0.6656 seconds of the 35.6 seconds elapsed. The bottleneck must be elsewhere, presumably in the XML parsing, though possibly in some other list processing.
For now I'll stick with 2.6.6, but I'm always happy to test any other version you may have.