pyppeteer: page.click and page.waitForNavigation issues while going through multiple pages

I'm learning puppeteer in JavaScript by following a book, the documentation, and some tutorials found online. I found a good tutorial that goes through multiple pages of a famous online shop and saves the items to a file. The JavaScript code I wrote following this tutorial, changing what had to be changed, works well. The problem is with my Python port using pyppeteer.

I had the issue described here https://github.com/miyakogi/pyppeteer/issues/58 and applied the solution in the following code:

import asyncio, json
from pyppeteer import launch
async def main():   
    browser = await launch(headless = False, defaultViewport = False)
    page = await browser.newPage()
    await page.goto(
        "https://shop_site_link",
        {'waitUntil': "load"}
    )
    items = []
    item_keys = ['title','price','img']
    isBtnDisabled = False
    while (not isBtnDisabled):
        await page.waitForSelector('[data-cel-widget="search_result_0"]')
        ProductHandles = await page.querySelectorAll(
            "div.s-main-slot.s-result-list.s-search-results.sg-row > .s-result-item"
        )  # this replaces page.$$(...) from the JavaScript version
        for producthandle in ProductHandles:
            title = None
            price = None
            img = None
            # each field is optional, so a failed lookup leaves it as None
            try:
                title = await page.evaluate('''
                el => el.querySelector("h2 > a > span").textContent
                ''', producthandle)
            except Exception:
                print('some error')
            try:
                price = await page.evaluate('''
                el => el.querySelector(".a-price > .a-offscreen").textContent
                ''', producthandle)
            except Exception:
                print('some error')
            try:
                img = await page.evaluate('''
                el => el.querySelector(".s-image").getAttribute("src")
                ''', producthandle)
            except Exception:
                print('some error')
            if (title is not None):
                items.append(dict(zip(item_keys, [title, price, img])))
        is_disabled = await page.querySelector('.s-pagination-item.s-pagination-next.s-pagination-disabled') is not None
        isBtnDisabled = is_disabled
        if (not is_disabled):
            # start the click and the waits together, so the navigation is not missed
            await asyncio.wait([
                page.click(".s-pagination-next"),
                page.waitForSelector(".s-pagination-next", { 'visible': True }),
                page.waitForNavigation({'waitUntil' : "networkidle2"},timeout=15000)
            ])
    #await browser.close()
    print(len(items))
    with open('items.json', 'w') as f:
        json.dump(items, f, indent = 2)
    # with open('items.json', 'r') as readfile:
    #     print(json.load(readfile))
asyncio.get_event_loop().run_until_complete(main())   

As per the issue described on the pyppeteer GitHub, I issued page.click and page.waitForNavigation at the "same time" this way:

        if (not is_disabled):
            await asyncio.wait([
                page.click(".s-pagination-next"),
                page.waitForSelector(".s-pagination-next", { 'visible': True }),
                page.waitForNavigation({'waitUntil' : "networkidle2"},timeout=15000)
            ])

trying to do what I do in the JavaScript code here:

if (!is_disabled) {
    await Promise.all([
        page.click(".s-pagination-next"),
        page.waitForNavigation({ waitUntil: "networkidle2" }),
    ]);
}

Now, the issue and related question: the code works well, but I receive the following warning:

DeprecationWarning: The explicit passing of coroutine objects to asyncio.wait() is deprecated since Python 3.8, and scheduled for removal in Python 3.11.

Does anyone know a better implementation that will work well with Python 3.11?

In the end I found a solution that may be useful for whoever has the same issue:

        # requires "import sys" at the top of the script
        if (not is_disabled):
            if sys.version_info < (3, 11):
                await asyncio.wait([
                    page.click(".s-pagination-next"),
                    page.waitForSelector(".s-pagination-next", { 'visible': True }),
                    page.waitForNavigation({'waitUntil' : "networkidle2"},timeout=15000)
                ])
            else:
                # asyncio.TaskGroup is available from Python 3.11
                async with asyncio.TaskGroup() as tg:
                    task1 = tg.create_task(page.click(".s-pagination-next"))
                    task2 = tg.create_task(page.waitForNavigation({'waitUntil' : "networkidle2"},timeout=15000))

So basically, use TaskGroup to wait for all tasks to complete.
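
If the version check is unwanted, asyncio.gather might be a simpler option: it is the closest standard-library counterpart to Promise.all, it accepts coroutines directly, and it does not trigger the asyncio.wait deprecation warning. The sketch below is only an illustration under assumptions: FakePage is a hypothetical stand-in for a pyppeteer page so that it runs without a browser, and click_and_wait is a made-up helper name. With a real page, the body of click_and_wait should carry over unchanged.

```python
import asyncio


class FakePage:
    """Hypothetical stand-in for a pyppeteer page (assumption, not the real API)."""

    def __init__(self):
        self.events = []

    async def click(self, selector):
        # Simulate the time the click takes.
        await asyncio.sleep(0.01)
        self.events.append(f"click {selector}")

    async def waitForNavigation(self, options=None, **kwargs):
        # Simulate a navigation that finishes after the click.
        await asyncio.sleep(0.05)
        self.events.append("navigated")
        return "response"


async def click_and_wait(page, selector):
    # gather() schedules both coroutines as tasks and returns their results
    # in argument order once both have finished, much like Promise.all.
    _, response = await asyncio.gather(
        page.click(selector),
        page.waitForNavigation({"waitUntil": "networkidle2"}, timeout=15000),
    )
    return response


page = FakePage()
result = asyncio.run(click_and_wait(page, ".s-pagination-next"))
print(result, page.events)
```

One difference from TaskGroup worth keeping in mind: if one awaitable raises, gather propagates that exception but leaves the other one running, whereas TaskGroup cancels the remaining tasks.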