Selenium Scraping Javascript Table





















I am struggling to scrape the page as per the code below. I would appreciate it if someone could have a look at what I am missing.
Regards
PyProg70



from selenium import webdriver
from selenium.webdriver import FirefoxOptions
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from bs4 import BeautifulSoup
import pandas as pd
import re, time

binary = FirefoxBinary('/usr/bin/firefox')
opts = FirefoxOptions()
opts.add_argument("--headless")

browser = webdriver.Firefox(options=opts, firefox_binary=binary)
browser.implicitly_wait(10)

url = 'http://tenderbulletin.eskom.co.za/'
browser.get(url)

html = browser.page_source
soup = BeautifulSoup(html, 'lxml')

print(soup.prettify())




























  • idownvotedbecau.se/nomcve
    – Alastair McCormack
    Nov 10 at 13:29














python selenium






edited Nov 10 at 14:16 by ewwink
asked Nov 10 at 13:25 by PyProg70
















1 Answer
accepted










That is JavaScript, not Java. The page is dynamic, so you need to wait until the Ajax request has finished and the content has been rendered. Use WebDriverWait for that.



# .... (imports from the question)
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# ..... (browser setup from the question)
browser.get(url)

# wait up to 30 seconds until the table cells have been rendered
WebDriverWait(browser, 30).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, 'table.CSSTableGenerator .ng-binding'))
)

# find_element(By...) replaces find_element_by_css_selector, which was removed in Selenium 4
table = browser.find_element(By.CSS_SELECTOR, 'table.CSSTableGenerator')
soup = BeautifulSoup(table.get_attribute("outerHTML"), 'lxml')
print(soup.prettify())
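
For anyone wondering why the wait matters: the code in the question reads browser.page_source right after get(), and the implicit wait only applies to element lookups, so nothing waits for the table. Below is a minimal, browser-free sketch of what an explicit wait does: poll a condition until it holds or a timeout expires. The names wait_until and FakePage are made up for this illustration, and this is not Selenium's actual implementation.

```python
import time


def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Hypothetical stand-in for a page whose table rows appear only after a
# delay, the way Ajax-rendered content does. No browser is involved.
class FakePage:
    def __init__(self, ready_after):
        self._ready_at = time.monotonic() + ready_after

    def find_rows(self):
        if time.monotonic() < self._ready_at:
            return []  # content not rendered yet
        return ["row 1", "row 2"]


page = FakePage(ready_after=0.2)
print(page.find_rows())  # read straight away: typically still empty
rows = wait_until(page.find_rows, timeout=5, poll_interval=0.05)
print(rows)  # read after waiting: the rows are available
```

Reading the page immediately corresponds to the first print, while WebDriverWait(...).until(...) in the answer plays the role of wait_until here.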

























  • Thank you Alastair, this worked perfectly !!
    – PyProg70
    Nov 11 at 5:48










  • Apologies, I mean ewwink....
    – PyProg70
    Nov 11 at 5:57










answered Nov 10 at 14:13 by ewwink










