79673869

Date: 2025-06-20 19:21:33
Score: 5

I may not have been clear, even though the responses have been helpful.

I'd rather not delve into thread programming unless I have to, so I did some profiling first. Here are the approximate timings:

p = sync_playwright().start()   # 0.4 seconds
browser = p.firefox.launch()    # 0.8 seconds
page = browser.new_page()       # 0.9 seconds
page.goto(url)                  # 2.5 - 3.2 seconds

So the start-up overhead (about 2.1 seconds) is roughly 40% to 46% of the full request time, depending on how long page.goto() takes. Definitely worth trying to optimize.
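To double-check that percentage, here is the arithmetic as a small script. It only uses the approximate timings listed above; the function name `startup_share` is made up for illustration.

```python
# Approximate timings from the profiling above, in seconds.
startup = 0.4 + 0.8 + 0.9          # start() + launch() + new_page()
goto_fast, goto_slow = 2.5, 3.2    # observed range for page.goto(url)


def startup_share(goto_seconds):
    """Fraction of the total request time spent on start-up."""
    return startup / (startup + goto_seconds)


share_when_goto_fast = startup_share(goto_fast)
share_when_goto_slow = startup_share(goto_slow)

# Start-up weighs more when the page load itself is quick.
print(f"{share_when_goto_slow:.0%} to {share_when_goto_fast:.0%}")
```

With these rough numbers, start-up is a bit under half of each request in the best case for page.goto() and about two fifths in the worst case.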

It looks like I want to put:

p = sync_playwright().start()   
browser = p.firefox.launch()   
page = browser.new_page()  

in the "parent" thread, but make p, browser, and page available to each "child" thread.

So, I changed my code to look like:

thread_data = threading.local()
thread_data.p = sync_playwright().start()

@app.route('/fetch/')
def fetch_url():
    url = request.args.get('url')
    return fetch(url)


def fetch(url, thread_data=thread_data):
    p = thread_data.p
    browser = p.firefox.launch()

....


When I run this, I get:

  File "/home/mdiehl/Development///Checker/./app.py", line 27, in fetch
    p = thread_data.p
        ^^^^^^^^^^^^^
AttributeError: '_thread._local' object has no attribute 'p'
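
The same error can be reproduced without Flask or Playwright, which suggests the cause is `threading.local()` itself: an attribute set on it is only visible in the thread that set it. The sketch below uses a plain string as a stand-in for the Playwright object, and assumes (I have not verified this for my setup) that Flask runs the request handler in a different thread from the one that executed the module-level assignment.

```python
import threading

thread_data = threading.local()
# Stand-in for sync_playwright().start(); set in the main thread only.
thread_data.p = "set in the main thread"

results = {}


def worker():
    # A different thread gets its own, initially empty, view of thread_data,
    # so the attribute assigned in the main thread does not exist here.
    results["worker_has_p"] = hasattr(thread_data, "p")


t = threading.Thread(target=worker)
t.start()
t.join()

# Back in the main thread the attribute is still there.
results["main_has_p"] = hasattr(thread_data, "p")
print(results)
```

So `threading.local()` is the opposite of what I was after: it isolates data per thread rather than sharing it between threads.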

So I guess I don't understand how threading works in Python (I've done it in Perl and C).
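
One direction that might work is to create the objects lazily, once per thread, instead of once in the parent. This is only a sketch under assumptions: `get_per_thread` and `get_browser` are hypothetical helpers, and it assumes the sync Playwright objects should stay in the thread that created them, so each thread starts its own.

```python
import threading

_local = threading.local()


def get_per_thread(name, factory):
    """Return this thread's object called `name`, creating it on first use."""
    try:
        return getattr(_local, name)
    except AttributeError:
        # First access in this thread: build the object and cache it.
        value = factory()
        setattr(_local, name, value)
        return value


def get_browser():
    # Hypothetical helper: each thread starts Playwright and Firefox once,
    # then reuses them on later calls made from that same thread.
    from playwright.sync_api import sync_playwright

    p = get_per_thread("p", lambda: sync_playwright().start())
    return get_per_thread("browser", lambda: p.firefox.launch())
```

This would only save time if the server reuses its threads across requests (a worker pool). If every request gets a brand-new thread, each request still pays the full start-up cost, so the threading model of the server matters as much as the caching.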

Any further help would be appreciated.

Mike.
Reasons:
  • Blacklisted phrase (1): appreciated
  • RegEx Blacklisted phrase (3): help would be appreciated
  • RegEx Blacklisted phrase (1): I want
  • Long answer (-1):
  • Has code block (-0.5):
  • Self-answer (0.5):
  • Low reputation (1):
Posted by: Mike Diehl