79313273

Date: 2024-12-28 05:39:55
Score: 1
Natty:
Report link

I couldn't figure out how to use selenium, so I just used regex on the page's source code. This gets everything I want asides from the odds.

from urllib import request
import re
import pandas as pd

response = request.urlopen("https://ai-goalie.com/index.html")
# set the correct charset below
page_source = response.read().decode('utf-8')

data_league = re.findall("(?<=data-league=\")[^\"]*", page_source)
data_home = re.findall("(?<=data-home=\")[^\"]*", page_source)
data_away = re.findall("(?<=data-away=\")[^\"]*", page_source)
data_time = re.findall("(?<=data-time=\")[^\"]*", page_source)
data_date = re.findall("(?<=data-date=\")[^\"]*", page_source)
certainty = re.findall("(?<=\"certainty\">)[^<]*", page_source)
probability = re.findall("\d+%(?=\n)", page_source)

df = pd.DataFrame({'league': data_league, 'home_team': data_away, 'away_team': data_away, 'time': data_time, 'date': data_date, 'certainty': certainty, 'probability': probability})
Reasons:
  • RegEx Blacklisted phrase (1): I want
  • Long answer (-0.5):
  • Has code block (-0.5):
  • Self-answer (0.5):
  • Low reputation (0.5):
Posted by: blahahaaahahaa