2021-11-09, 19:24
Before I start: this does not give users a browser GUI of any kind. It is an addon for developers who want to use the Chromium browser for web scraping.
What is it?:
This addon uses the system Docker to spawn a Chromium process with its GUI enabled. Even though the GUI is enabled inside the container, your host OS does not need a working X11 or any kind of redirection, because Chromium's GUI runs inside Xvfb (X virtual framebuffer). The spawned Chromium process has CDP (Chrome DevTools Protocol) enabled, so any third-party client can connect over websockets and control the browser according to your needs.
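To make the CDP part a bit more concrete, here is a minimal sketch of how a third-party client could find the websocket URL of a CDP-enabled browser. It uses the `/json` HTTP endpoint that Chromium serves on its remote debugging port. The function names are made up for this example and are not part of the addon, and the host and port are simply the service defaults mentioned in this post.

```python
import json
from urllib.request import urlopen


def list_targets(host="127.0.0.1", port=9222, timeout=5):
    """Ask the CDP HTTP endpoint for the list of open targets (tabs etc.)."""
    url = "http://%s:%d/json" % (host, port)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def first_page_ws_url(targets):
    """Return the websocket debugger URL of the first 'page' target, or None."""
    for target in targets:
        if target.get("type") == "page" and "webSocketDebuggerUrl" in target:
            return target["webSocketDebuggerUrl"]
    return None
```

With the service running, `first_page_ws_url(list_targets())` should return the URL that a websocket client has to connect to. The addon already does this kind of bookkeeping for you, so this is only useful if you want to attach your own client.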
How can I use it?:
Add the addon as a dependency in your addon.xml and use the following syntax:
python:
from chromium import Browser  # import the library

useragent = None  # string used to spoof the User-Agent; do not spoof it unless you know what you are doing
loadtimeout = 5  # maximum time in seconds to wait for the document to load dynamically (time needed for JS or client-side scripts to finish manipulating the DOM)
maxtimeout = 15  # maximum total time to wait in seconds; if this times out, the tab is closed and the return value is whatever has already been received
port = 9222  # the CDP port to connect to. The service is hardcoded to run on 127.0.0.1:9222, so changing this value does not move the service; change it only to connect to another instance on your PC for debugging.

# The Browser arguments default to the values above
with Browser(useragent, loadtimeout, maxtimeout, port) as browser:
    page = browser.navigate("https://www.google.com")
    # the return value is the HTML of the page as unicode

# so you can simply use:
with Browser() as browser:
    page = browser.navigate("https://www.google.com")
Why would I use it?:
I started this to get past Cloudflare's IUAM (I'm Under Attack Mode) when using Tor. Unlike normal browsing, browsing over Tor is close to impossible, thanks to Cloudflare being a fascist toward requests that come from the Tor network: it always serves the JavaScript challenge page, and this challenge mechanism is extremely dynamic, complicated, and frequently updated. Therefore I gave up on pure Python bypass techniques and use this instead, and so far it works like a charm. Besides, today's websites depend heavily on client-side scripting, so this is more or less a must for scraping stuff from the web.
Which platforms are supported?:
Linux, only Linux. I have tested on my Mint desktop and a CoreELEC STB, and everything is smooth there. It is written to be PY2 + PY3 compatible, so it works on both Kodi 18 and 19 from the same repo. Architecturally, I have infrastructure that can support versions as old as XBMC 14 Gotham, but don't go that far; I have never tested lower than 18.
This system uses Xvfb, which is a Linux component, so it will not work on Windows or Android. Maybe another method could be considered for Windows, but there is no chance on Android. At least, I don't really care about it.
PS: on Linux, make sure the user that spawns the Kodi process has rights to execute docker commands. The service basically automates Docker, so there must be a working Docker to use this. On CoreELEC/LibreELEC it is based on service.system.docker, which comes with the distro, so you don't need to do much.
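If you are not sure whether the Kodi user is allowed to talk to Docker, a quick check like the one below can help. This is only a sketch and not something the addon ships: the function name is made up, and it assumes that `docker ps` exits with a non-zero status when the daemon cannot be reached or the user lacks permission.

```python
import subprocess


def can_run(command=("docker", "ps")):
    """Return True if the command exists and exits with status 0."""
    try:
        status = subprocess.call(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # the binary is missing or not executable for this user
        return False
    return status == 0
```

Run `can_run()` as the same user that starts Kodi. If it returns False, fix the Docker installation or the user's permissions before debugging the addon itself.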
Advanced usage:
The browser instance already implements websocket connection management and primitives for executing CDP commands, so instead of using the .navigate method directly you can interact with CDP as you wish.
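For anyone who has not used CDP before, this is roughly what a command looks like on the wire: a JSON object with a unique integer `id`, a `method` such as `Page.navigate`, and its `params`. The helpers below are a hypothetical sketch for illustration only; they are not the primitives of the addon, whose actual method names you should look up in the source.

```python
import itertools
import json

# every command sent over one websocket needs its own id
_ids = itertools.count(1)


def cdp_command(method, **params):
    """Serialize one CDP command as a JSON string."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


def is_reply_to(raw_message, command_id):
    """True if a received message is the response to the given command id.

    Events pushed by the browser carry a "method" but no "id",
    so they never match.
    """
    return json.loads(raw_message).get("id") == command_id
```

For example, `cdp_command("Page.navigate", url="https://www.google.com")` produces the message that asks a tab to open a URL, and `is_reply_to` lets a client tell the matching response apart from events such as `Page.loadEventFired`.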
Sources:
The code is here:
https://github.com/hbiyik/repository.biy...e.chromium
The Docker files:
https://github.com/hbiyik/docker-xvfb-chromium
The repo itself:
https://github.com/hbiyik/repository.biyik/releases
I highly suggest using the repo if you want to do development; this script has dependencies that are hard to manage manually.
Todo:
- Introduce a 'validation' callback as an input to the navigate method, so that the client waits until the DOM matches what is expected, instead of listening to DOM events with a timer as done in 0.0.2
- Make the script a program and utilize settings.xml
- Make the Docker settings customizable
- Make the port customizable? Who needs that? Dunno
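To illustrate the first item, a validation callback could work roughly like the sketch below. Nothing like this exists in the addon yet; `wait_for`, `fetch` and `validate` are hypothetical names, and `fetch` stands in for whatever call reads the current DOM from the browser.

```python
import time


def wait_for(fetch, validate, timeout=15, interval=0.5):
    """Call fetch() repeatedly until validate(result) is true or time runs out.

    The last fetched result is returned either way, similar to how
    navigate returns what it has already received when maxtimeout is hit.
    """
    deadline = time.time() + timeout
    result = fetch()
    while not validate(result) and time.time() < deadline:
        time.sleep(interval)
        result = fetch()
    return result
```

The caller would pass something like `lambda html: "some-element-id" in html` as `validate`, so the wait ends as soon as the expected content shows up instead of after a fixed timer.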
Known Issues:
On LibreELEC/CoreELEC a dialog keeps popping up saying "xvfb-chromium:kill", however it does not really kill anything. This does not impact operation, but it gets kind of annoying when it keeps spamming the beeps. I'll look into it. It seems like there is another service monitoring all LibreELEC Docker containers.