
antrrax · (This post was last modified: 2017-04-19, 06:49 by antrrax.)

Rewording the topic so that it does not violate the forum rules:

How can I get the HTML source of sites that require JavaScript without using Selenium WebDriver? Is there any other way?

Skipmode A1 wrote: You can run JavaScript with Node.js; see https://github.com/Anorov/cloudflare-scrape

This, however, requires Node.js to be present on the computer running your Kodi add-on.

I use Windows, and I installed cfscrape for Python 2.7:

Code:
python -m pip install -U pip cfscrape

I also installed Node.js:

Code:
https://nodejs.org/dist/v7.9.0/node-v7.9.0-x86.msi
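
Since the quoted advice says Node.js has to be present on the machine, it can help to confirm from Python that the executable is actually visible before testing cfscrape. The following is only a minimal sketch: the function names `find_node` and `node_version` are made up for illustration, and it assumes that Node.js is looked up through the PATH under the name `node` (or `nodejs`, as on some Linux distributions).

```python
# Sketch: check whether a Node.js executable is visible on PATH.
# find_node() and node_version() are hypothetical helper names.
import subprocess

try:
    from shutil import which  # Python 3.3+
except ImportError:
    from distutils.spawn import find_executable as which  # Python 2.7


def find_node(candidates=("node", "nodejs")):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def node_version(path):
    """Return the text printed by `<path> --version`."""
    out = subprocess.check_output([path, "--version"])
    return out.decode("ascii", "replace").strip()
```

If `find_node()` returns None in the same environment where the script runs (for example inside IDLE), the installer probably did not update the PATH for that process, and restarting the session may be needed.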

I'm trying to get the source code of a site where JavaScript is required, e.g. sites protected by Cloudflare.
First test in Python IDLE (Windows):

Code:
import cfscrape

scraper = cfscrape.create_scraper()

print(scraper.get('http://somesite.com.Cloudflare-anti-bot').content)

But it still did not run the JavaScript. Could you give me an example of how to use cfscrape with Node.js?

Thanks

Skipmode A1 · 2017-04-19, 10:50

Here is some code that seemed to work when I was messing with it a while back:

Code:
# Fragment from inside an add-on class method. It assumes that requests, sys,
# xbmcgui and BeautifulSoup are already imported and that SETTINGS and LANGUAGE
# are defined by the add-on.

        # Make a session
        sess = requests.session()

        # Set cookies for cookie-firewall and nsfw-switch
        if SETTINGS.getSetting('nsfw') == 'true':
            cookies = {"Cookie": "cpc=10", "nsfw": "1"}
        else:
            cookies = {"Cookie": "cpc=10"}

        # Determine if cloudflare protection is active or not
        html_source = sess.get(self.video_list_page_url, cookies=cookies).text
        # A plain substring test avoids str() failing on non-ASCII text in Python 2
        cloudflare_active = "cloudflare" in html_source

        # Get the page
        if cloudflare_active:
            try:
                import cfscrape
            except ImportError:
                # cfscrape is not installed
                xbmcgui.Dialog().ok(LANGUAGE(30000), LANGUAGE(30513))
                sys.exit(1)

            try:
                # returns a CloudflareScraper instance
                scraper = cfscrape.create_scraper(sess)
            except Exception:
                xbmcgui.Dialog().ok(LANGUAGE(30000), LANGUAGE(30514))
                sys.exit(1)

            try:
                html_source = scraper.get(self.video_list_page_url).content
            except Exception:
                xbmcgui.Dialog().ok(LANGUAGE(30000), LANGUAGE(30515))
                sys.exit(1)

        # Parse response
        soup = BeautifulSoup(html_source)
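
One caveat about the detection step in the code above: searching the page for the word "cloudflare" also matches ordinary pages that merely mention it in their markup, so cfscrape may be used when it is not needed. A slightly stricter heuristic could look at the status code and the Server header as well. This is only a sketch: `looks_like_cloudflare_challenge` is a hypothetical name, and the status codes and body markers below are assumptions based on what the challenge page looked like at the time, not something the library guarantees.

```python
def looks_like_cloudflare_challenge(status_code, headers, body):
    """Heuristic only: True if the response resembles a Cloudflare
    "checking your browser" page rather than the real content.

    status_code -- HTTP status as an int
    headers     -- mapping with the response headers (e.g. response.headers)
    body        -- response body as text (e.g. response.text)
    """
    server = headers.get("Server", "").lower()
    text = body.lower()
    return (
        status_code in (403, 503)            # assumed challenge status codes
        and server.startswith("cloudflare")  # served by Cloudflare itself
        and ("jschl" in text or "checking your browser" in text)  # assumed markers
    )
```

With requests this could be called as `looks_like_cloudflare_challenge(r.status_code, r.headers, r.text)` on the response of the first `sess.get(...)`.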

pinoytracker · 2017-04-20, 10:04

Any update on this? Thanks.

Franc · 2018-06-03, 22:50

How fast is this method?
I've tried something like this, but it's slow: it takes 25-30 seconds before I get the HTML content.
(Cloudflare protection)

Code:
import cfscrape

scraper = cfscrape.create_scraper()

print(scraper.get("http://somesite.com").content)
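
To see where the time goes, one option is to time several requests made through the same scraper object instead of creating a new one each time. The sketch below is not a benchmark of cfscrape itself: `timed_fetch` and `time_repeated` are hypothetical helper names, and the idea that later requests are faster rests on the assumption that the scraper keeps the cookies from the first solved challenge.

```python
from timeit import default_timer


def timed_fetch(fetch, url):
    """Call fetch(url) and return (result, elapsed_seconds)."""
    start = default_timer()
    result = fetch(url)
    return result, default_timer() - start


def time_repeated(fetch, url, runs=2):
    """Fetch the same URL `runs` times with the same callable and
    return the list of elapsed times in seconds."""
    return [timed_fetch(fetch, url)[1] for _ in range(runs)]


# Possible usage (needs cfscrape and network access, so it is left commented out):
# import cfscrape
# scraper = cfscrape.create_scraper()
# print(time_repeated(lambda u: scraper.get(u).content, "http://somesite.com"))
```

If only the first timing is large, the delay comes from solving the challenge and keeping one scraper for the whole session should help; if every request is slow, the cause is probably elsewhere.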