API Reference¶
Commands¶
pyppeteer-install
: Download and install chromium for pyppeteer.
Environment Variables¶
$PYPPETEER_HOME
: Specify the directory to be used by pyppeteer. Pyppeteer uses this directory for extracting the downloaded Chromium and for creating temporary user data directories. The default location depends on the platform:
Windows: C:\Users\<username>\AppData\Local\pyppeteer
OS X: /Users/<username>/Library/Application Support/pyppeteer
Linux: /home/<username>/.local/share/pyppeteer, or $XDG_DATA_HOME/pyppeteer if $XDG_DATA_HOME is defined.
For details, see appdirs’s user_data_dir.
$PYPPETEER_DOWNLOAD_HOST
: Overwrite the host part of the URL that is used to download Chromium. Defaults to https://storage.googleapis.com.
$PYPPETEER_CHROMIUM_REVISION
: Specify a certain version of chromium you’d like pyppeteer to use. The default value can be checked with pyppeteer.__chromium_revision__.
$PYPPETEER_NO_PROGRESS_BAR
: Suppress the progress bar during the chromium download. Acceptable values are 1 or true (case-insensitive).
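The lookup rules above can be sketched in plain Python. This is only an illustration of the documented behaviour, not pyppeteer's own code; the helper names `linux_default_home` and `progress_bar_disabled` are hypothetical.

```python
from typing import Mapping, Optional


def linux_default_home(env: Mapping[str, str], username: str) -> str:
    """Resolve the pyppeteer directory on Linux as documented (hypothetical helper)."""
    # An explicit $PYPPETEER_HOME specifies the directory outright.
    explicit = env.get('PYPPETEER_HOME')
    if explicit:
        return explicit
    # Otherwise use $XDG_DATA_HOME/pyppeteer when $XDG_DATA_HOME is defined.
    xdg = env.get('XDG_DATA_HOME')
    if xdg:
        return xdg.rstrip('/') + '/pyppeteer'
    # Fall back to the documented per-user default.
    return '/home/{}/.local/share/pyppeteer'.format(username)


def progress_bar_disabled(value: Optional[str]) -> bool:
    """Interpret $PYPPETEER_NO_PROGRESS_BAR: '1' or 'true', case-insensitive."""
    return value is not None and value.lower() in ('1', 'true')
```

When setting these variables from Python rather than from the shell, it is probably safest to assign them in `os.environ` before pyppeteer is imported.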
Pyppeteer Main Module¶
- async pyppeteer.launch(options: dict = None, **kwargs: Any) Browser [source]¶
Start the chrome process and return a Browser. This function is a shortcut to Launcher(options, **kwargs).launch().
Available options are:
* ignoreHTTPSErrors (bool): Whether to ignore HTTPS errors. Defaults to False.
* headless (bool): Whether to run the browser in headless mode. Defaults to True unless the appMode or devtools option is True.
* executablePath (str): Path to a Chromium or Chrome executable to run instead of the default bundled Chromium.
* slowMo (int|float): Slow down pyppeteer operations by the specified number of milliseconds.
* defaultViewport (dict): Set a consistent viewport for each page. Defaults to an 800x600 viewport. None disables the default viewport.
  * width (int): page width in pixels.
  * height (int): page height in pixels.
  * deviceScaleFactor (int|float): Specify the device scale factor (can be thought of as dpr). Defaults to 1.
  * isMobile (bool): Whether the meta viewport tag is taken into account. Defaults to False.
  * hasTouch (bool): Specify if the viewport supports touch events. Defaults to False.
  * isLandscape (bool): Specify if the viewport is in landscape mode. Defaults to False.
* args (List[str]): Additional arguments (flags) to pass to the browser process.
* ignoreDefaultArgs (bool or List[str]): If True, do not use defaultArgs(). If a list is given, filter out the given default arguments. Dangerous option; use with care. Defaults to False.
* handleSIGINT (bool): Close the browser process on Ctrl+C. Defaults to True.
* handleSIGTERM (bool): Close the browser process on SIGTERM. Defaults to True.
* handleSIGHUP (bool): Close the browser process on SIGHUP. Defaults to True.
* dumpio (bool): Whether to pipe the browser process stdout and stderr into process.stdout and process.stderr. Defaults to False.
* userDataDir (str): Path to a user data directory.
* env (dict): Specify environment variables that will be visible to the browser. Defaults to the same as the python process.
* devtools (bool): Whether to auto-open a DevTools panel for each tab. If this option is True, the headless option will be set to False.
* logLevel (int|str): Log level to print logs. Defaults to the same as the root logger.
* autoClose (bool): Automatically close the browser process when the script completes. Defaults to True.
* loop (asyncio.AbstractEventLoop): Event loop (experimental).
* appMode (bool): Deprecated.
This function combines 3 steps:
1. Infer a set of flags to launch chromium with using defaultArgs().
2. Launch the browser and start managing its process according to the executablePath, handleSIGINT, dumpio, and other options.
3. Create an instance of the Browser class and initialize it with defaultViewport, slowMo, and ignoreHTTPSErrors.
The ignoreDefaultArgs option can be used to customize the behavior of step (1). For example, to filter out --mute-audio from the default arguments:
    browser = await launch(ignoreDefaultArgs=['--mute-audio'])
Note
Pyppeteer can also be used to control the Chrome browser, but it works best with the version of Chromium it is bundled with. There is no guarantee it will work with any other version. Use the executablePath option with extreme caution.
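As a rough sketch of how these options fit together, the snippet below assembles an options dictionary from keys documented above and passes it to launch(). The helper name `build_launch_options`, the example URL, and the `--disable-gpu` flag are illustrative assumptions; running `main()` requires pyppeteer and a downloaded Chromium.

```python
import asyncio
from typing import Any, Dict, Iterable


def build_launch_options(headless: bool = True,
                         width: int = 800,
                         height: int = 600,
                         extra_args: Iterable[str] = ()) -> Dict[str, Any]:
    """Assemble a launch() options dict (hypothetical helper, documented keys only)."""
    return {
        'headless': headless,
        'defaultViewport': {'width': width, 'height': height},
        'args': list(extra_args),
        'autoClose': True,
    }


async def main() -> str:
    # Imported lazily so the helper above works without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(build_launch_options(extra_args=['--disable-gpu']))
    try:
        page = await browser.newPage()
        await page.goto('https://example.com')
        return await page.title()
    finally:
        # Always close the browser, even if navigation fails.
        await browser.close()

# To run: asyncio.get_event_loop().run_until_complete(main())
```

Keeping the option assembly in a plain function makes it easy to inspect or log the exact configuration before a browser process is started.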
- async pyppeteer.connect(options: dict = None, **kwargs: Any) Browser [source]¶
Connect to an existing chrome.
The browserWSEndpoint or browserURL option is required to connect to the chrome. The format of browserWSEndpoint is ws://${host}:${port}/devtools/browser/<id> and the format of browserURL is http://127.0.0.1:9222. The value of browserWSEndpoint can be obtained from wsEndpoint.
Available options are:
* browserWSEndpoint (str): A browser websocket endpoint to connect to.
* browserURL (str): A browser URL to connect to.
* ignoreHTTPSErrors (bool): Whether to ignore HTTPS errors. Defaults to False.
* defaultViewport (dict): Set a consistent viewport for each page. Defaults to an 800x600 viewport. None disables the default viewport.
  * width (int): page width in pixels.
  * height (int): page height in pixels.
  * deviceScaleFactor (int|float): Specify the device scale factor (can be thought of as dpr). Defaults to 1.
  * isMobile (bool): Whether the meta viewport tag is taken into account. Defaults to False.
  * hasTouch (bool): Specify if the viewport supports touch events. Defaults to False.
  * isLandscape (bool): Specify if the viewport is in landscape mode. Defaults to False.
* slowMo (int|float): Slow down pyppeteer’s operations by the specified number of milliseconds.
* logLevel (int|str): Log level to print logs. Defaults to the same as the root logger.
* loop (asyncio.AbstractEventLoop): Event loop (experimental).
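A minimal sketch of connecting to an already running chrome follows. The helper `ws_endpoint` is hypothetical and simply fills in the documented endpoint format; the browser id must come from the running browser (for example from its wsEndpoint value), and `attach()` assumes such a browser is reachable.

```python
def ws_endpoint(host: str, port: int, browser_id: str) -> str:
    """Build a browserWSEndpoint string in the documented format (hypothetical helper)."""
    return 'ws://{}:{}/devtools/browser/{}'.format(host, port, browser_id)


async def attach(endpoint: str) -> int:
    # Imported lazily so ws_endpoint() works without pyppeteer installed.
    from pyppeteer import connect

    browser = await connect(browserWSEndpoint=endpoint)
    try:
        pages = await browser.pages()
        return len(pages)
    finally:
        # Detach from the browser without closing the remote chrome process.
        await browser.disconnect()
```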
- pyppeteer.defaultArgs(options: Dict = None, **kwargs: Any) List[str] [source]¶
Get the default flags that chromium will be launched with.
options or keyword arguments are a set of configurable options to set on the browser. They can have the following fields:
* headless (bool): Whether to run the browser in headless mode. Defaults to True unless the devtools option is True.
* args (List[str]): Additional arguments to pass to the browser instance. The list of chromium flags can be found here.
* userDataDir (str): Path to a user data directory.
* devtools (bool): Whether to auto-open a DevTools panel for each tab. If this option is True, the headless option will be set to False.
Browser Class¶
- class pyppeteer.browser.Browser(connection: Connection, contextIds: List[str], ignoreHTTPSErrors: bool, defaultViewport: Optional[Dict], process: Optional[Popen] = None, closeCallback: Callable[[], Awaitable[None]] = None, **kwargs: Any)[source]¶
Browser class.
A Browser object is created when pyppeteer connects to chrome, either through launch() or connect().
- property browserContexts: List[BrowserContext]¶
Return a list of all open browser contexts.
In a newly created browser, this will return a single instance of BrowserContext.
- coroutine createIncogniteBrowserContext() BrowserContext [source]¶
[Deprecated] Misspelled method.
Use the createIncognitoBrowserContext() method instead.
- coroutine createIncognitoBrowserContext() BrowserContext [source]¶
Create a new incognito browser context.
This won’t share cookies/cache with other browser contexts.
    browser = await launch()
    # Create a new incognito browser context.
    context = await browser.createIncognitoBrowserContext()
    # Create a new page in a pristine context.
    page = await context.newPage()
    # Do stuff
    await page.goto('https://example.com')
    ...
- coroutine pages() List[Page] [source]¶
Get all pages of this browser.
Non-visible pages, such as "background_page", will not be listed here. You can find them using pyppeteer.target.Target.page().
In case of multiple browser contexts, this method will return a list with all the pages in all browser contexts.
- property process: Optional[Popen]¶
Return process of this browser.
If the browser instance was created by pyppeteer.launcher.connect(), return None.
- targets() List[Target] [source]¶
Get a list of all active targets inside the browser.
In case of multiple browser contexts, the method will return a list with all the targets in all browser contexts.
- coroutine userAgent() str [source]¶
Return browser’s original user agent.
Note
Pages can override browser user agent with
pyppeteer.page.Page.setUserAgent()
.
- property wsEndpoint: str¶
Return websocket end point url.
BrowserContext Class¶
- class pyppeteer.browser.BrowserContext(browser: Browser, contextId: Optional[str])[source]¶
BrowserContext provides multiple independent browser sessions.
When a browser is launched, it has a single BrowserContext used by default. The method browser.newPage() creates a page in the default browser context.
If a page opens another page, e.g. with a window.open call, the popup will belong to the parent page’s browser context.
Pyppeteer allows creation of “incognito” browser contexts with the browser.createIncognitoBrowserContext() method. “Incognito” browser contexts don’t write any browser data to disk.
    # Create new incognito browser context
    context = await browser.createIncognitoBrowserContext()
    # Create a new page inside context
    page = await context.newPage()
    # ... do stuff with page ...
    await page.goto('https://example.com')
    # Dispose context once it's no longer needed
    await context.close()
- coroutine close() None [source]¶
Close the browser context.
All the targets that belong to the browser context will be closed.
Note
Only incognito browser context can be closed.
- isIncognite() bool [source]¶
[Deprecated] Misspelled method.
Use the isIncognito() method instead.
- isIncognito() bool [source]¶
Return whether BrowserContext is incognito.
The default browser context is the only non-incognito browser context.
Note
The default browser context cannot be closed.
- coroutine pages() List[Page] [source]¶
Return list of all open pages.
Non-visible pages, such as "background_page", will not be listed here. You can find them using pyppeteer.target.Target.page().
Page Class¶
- class pyppeteer.page.Page(client: CDPSession, target: Target, frameTree: Dict, ignoreHTTPSErrors: bool, screenshotTaskQueue: list = None)[source]¶
Page class.
This class provides methods to interact with a single tab of chrome. One Browser object might have multiple Page objects.
The Page class emits various Events which can be handled by using the on or once method, which is inherited from pyee’s EventEmitter class.
- Events = namespace(Close='close', Console='console', Dialog='dialog', DOMContentLoaded='domcontentloaded', Error='error', PageError='pageerror', Request='request', Response='response', RequestFailed='requestfailed', RequestFinished='requestfinished', FrameAttached='frameattached', FrameDetached='framedetached', FrameNavigated='framenavigated', Load='load', Metrics='metrics', WorkerCreated='workercreated', WorkerDestroyed='workerdestroyed')¶
Available events.
- coroutine J(selector: str) Optional[ElementHandle] ¶
alias to
querySelector()
- coroutine JJ(selector: str) List[ElementHandle] ¶
alias to
querySelectorAll()
- coroutine JJeval(selector: str, pageFunction: str, *args: Any) Any ¶
alias to
querySelectorAllEval()
- coroutine Jeval(selector: str, pageFunction: str, *args: Any) Any ¶
alias to
querySelectorEval()
- coroutine Jx(expression: str) List[ElementHandle] ¶
alias to
xpath()
- coroutine addScriptTag(options: Dict = None, **kwargs: str) ElementHandle [source]¶
Add script tag to this page.
One of the url, path or content options is necessary.
* url (string): URL of a script to add.
* path (string): Path to the local JavaScript file to add.
* content (string): JavaScript string to add.
* type (string): Script type. Use module in order to load a JavaScript ES6 module.
Return ElementHandle: ElementHandle of the added tag.
- coroutine addStyleTag(options: Dict = None, **kwargs: str) ElementHandle [source]¶
Add style or link tag to this page.
One of the url, path or content options is necessary.
* url (string): URL of the link tag to add.
* path (string): Path to the local CSS file to add.
* content (string): CSS string to add.
Return ElementHandle: ElementHandle of the added tag.
- coroutine authenticate(credentials: Dict[str, str]) Any [source]¶
Provide credentials for http authentication.
credentials should be None or a dict which has username and password fields.
- coroutine click(selector: str, options: dict = None, **kwargs: Any) None [source]¶
Click the element which matches selector.
This method fetches an element with selector, scrolls it into view if needed, and then uses mouse to click in the center of the element. If there’s no element matching selector, the method raises PageError.
Available options are:
* button (str): left, right, or middle. Defaults to left.
* clickCount (int): Defaults to 1.
* delay (int|float): Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
Note
If this method triggers a navigation event and there’s a separate waitForNavigation(), you may end up with a race condition that yields unexpected results. The correct pattern for click and wait for navigation is the following:
    await asyncio.gather(
        page.waitForNavigation(waitOptions),
        page.click(selector, clickOptions),
    )
- coroutine close(options: Dict = None, **kwargs: Any) None [source]¶
Close this page.
Available options:
* runBeforeUnload (bool): Defaults to False. Whether to run the before unload page handlers.
By default, close() does not run beforeunload handlers.
Note
If runBeforeUnload is passed as True, a beforeunload dialog might be summoned and should be handled manually via the page’s dialog event.
- coroutine content() str [source]¶
Get the full HTML contents of the page.
Returns HTML including the doctype.
- coroutine cookies(*urls: str) List[Dict[str, Union[str, int, bool]]] [source]¶
Get cookies.
If no URLs are specified, this method returns cookies for the current page URL. If URLs are specified, only cookies for those URLs are returned.
Returned cookies are a list of dictionaries which contain these fields:
* name (str)
* value (str)
* url (str)
* domain (str)
* path (str)
* expires (number): Unix time in seconds
* httpOnly (bool)
* secure (bool)
* session (bool)
* sameSite (str): 'Strict' or 'Lax'
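Because cookies() returns plain dictionaries, the result can be post-processed with ordinary Python. The sketch below assumes only the field names listed above; the helper names `cookies_by_name`, `session_cookies` and `dump_cookies` are hypothetical.

```python
from typing import Any, Dict, List


def cookies_by_name(cookies: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each cookie name to its value (later duplicates overwrite earlier ones)."""
    return {cookie['name']: cookie['value'] for cookie in cookies}


def session_cookies(cookies: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only cookies whose 'session' field is truthy."""
    return [cookie for cookie in cookies if cookie.get('session')]


async def dump_cookies(page: Any) -> Dict[str, str]:
    # `page` is expected to be a pyppeteer Page; cookies() with no
    # arguments returns cookies for the current page URL.
    return cookies_by_name(await page.cookies())
```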
- coroutine deleteCookie(*cookies: dict) None [source]¶
Delete cookie.
cookies should be dictionaries which contain these fields:
* name (str): required
* url (str)
* domain (str)
* path (str)
* secure (bool)
- coroutine emulate(options: dict = None, **kwargs: Any) None [source]¶
Emulate given device metrics and user agent.
This method is a shortcut for calling two methods: setUserAgent() and setViewport().
options is a dictionary containing these fields:
* viewport (dict)
  * width (int): page width in pixels.
  * height (int): page height in pixels.
  * deviceScaleFactor (float): Specify the device scale factor (can be thought of as dpr). Defaults to 1.
  * isMobile (bool): Whether the meta viewport tag is taken into account. Defaults to False.
  * hasTouch (bool): Specifies if the viewport supports touch events. Defaults to False.
  * isLandscape (bool): Specifies if the viewport is in landscape mode. Defaults to False.
* userAgent (str): user agent string.
- coroutine emulateMedia(mediaType: str = None) None [source]¶
Emulate css media type of the page.
- Parameters:
mediaType (str) – Changes the CSS media type of the page. The only allowed values are 'screen', 'print', and None. Passing None disables media emulation.
- coroutine evaluate(pageFunction: str, *args: Any, force_expr: bool = False) Any [source]¶
Execute js-function or js-expression on browser and get result.
- Parameters:
pageFunction (str) – String of js-function/expression to be executed on the browser.
force_expr (bool) – If True, evaluate pageFunction as an expression. If False (default), try to automatically detect function or expression.
Note: the force_expr option is a keyword-only argument.
- coroutine evaluateHandle(pageFunction: str, *args: Any) JSHandle [source]¶
Execute function on this page.
The difference between evaluate() and evaluateHandle() is that evaluateHandle() returns a JSHandle object (not a value).
- Parameters:
pageFunction (str) – JavaScript function to be executed.
- coroutine evaluateOnNewDocument(pageFunction: str, *args: str) None [source]¶
Add a JavaScript function to the document.
This function would be invoked in one of the following scenarios:
whenever the page is navigated
whenever the child frame is attached or navigated. In this case, the function is invoked in the context of the newly attached frame.
- coroutine exposeFunction(name: str, pyppeteerFunction: Callable[[…], Any]) None [source]¶
Add a python function to the browser’s window object as name.
The registered function can be called from the chrome process.
- Parameters:
name (string) – Name of the function on the window object.
pyppeteerFunction (Callable) – Function which will be called in the python process. This function should not be an asynchronous function.
- coroutine focus(selector: str) None [source]¶
Focus the element which matches selector.
If no element matches the selector, raise PageError.
- coroutine goBack(options: dict = None, **kwargs: Any) Optional[Response] [source]¶
Navigate to the previous page in history.
Available options are the same as for the goto() method.
If it cannot go back, return None.
- coroutine goForward(options: dict = None, **kwargs: Any) Optional[Response] [source]¶
Navigate to the next page in history.
Available options are the same as for the goto() method.
If it cannot go forward, return None.
- coroutine goto(url: str, options: dict = None, **kwargs: Any) Optional[Response] [source]¶
Go to the url.
- Parameters:
url (string) – URL to navigate page to. The url should include the scheme, e.g. https://.
Available options are:
* timeout (int): Maximum navigation time in milliseconds. Defaults to 30 seconds; pass 0 to disable the timeout. The default value can be changed by using the setDefaultNavigationTimeout() method.
* waitUntil (str|List[str]): When to consider navigation succeeded. Defaults to load. Given a list of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
  * load: when the load event is fired.
  * domcontentloaded: when the DOMContentLoaded event is fired.
  * networkidle0: when there are no more than 0 network connections for at least 500 ms.
  * networkidle2: when there are no more than 2 network connections for at least 500 ms.
Page.goto will raise errors if:
* there’s an SSL error (e.g. in case of self-signed certificates)
* the target URL is invalid
* the timeout is exceeded during navigation
* the main resource failed to load
Note
goto() either raises an error or returns a main resource response. The only exceptions are navigation to about:blank or navigation to the same URL with a different hash, which would succeed and return None.
Note
Headless mode doesn’t support navigation to a PDF document.
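The options above can be validated before navigating. The following is a sketch under the assumption that only the four documented waitUntil events are acceptable; `goto_options` and `open_page` are hypothetical helper names, and the 10-second timeout is an arbitrary example value.

```python
from typing import Any, Dict, List, Optional, Union

# The event names documented for the waitUntil option.
WAIT_UNTIL_EVENTS = ('load', 'domcontentloaded', 'networkidle0', 'networkidle2')


def goto_options(timeout_seconds: float = 30,
                 wait_until: Union[str, List[str]] = 'load') -> Dict[str, Any]:
    """Build a goto() options dict, converting seconds to milliseconds."""
    events = [wait_until] if isinstance(wait_until, str) else list(wait_until)
    for event in events:
        if event not in WAIT_UNTIL_EVENTS:
            raise ValueError('unknown waitUntil event: {!r}'.format(event))
    return {'timeout': int(timeout_seconds * 1000), 'waitUntil': wait_until}


async def open_page(page: Any, url: str) -> Optional[int]:
    # `page` is expected to be a pyppeteer Page.
    response = await page.goto(url, goto_options(10, ['load', 'networkidle2']))
    # goto() returns None for about:blank and same-URL hash navigation.
    return None if response is None else response.status
```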
- coroutine hover(selector: str) None [source]¶
Mouse hover the element which matches selector.
If no element matches the selector, raise PageError.
- coroutine injectFile(filePath: str) str [source]¶
[Deprecated] Inject file to this page.
This method is deprecated. Use
addScriptTag()
instead.
- coroutine metrics() Dict[str, Any] [source]¶
Get metrics.
Returns a dictionary containing metrics as key/value pairs:
* Timestamp (number): The timestamp when the metrics sample was taken.
* Documents (int): Number of documents in the page.
* Frames (int): Number of frames in the page.
* JSEventListeners (int): Number of events in the page.
* Nodes (int): Number of DOM nodes in the page.
* LayoutCount (int): Total number of full or partial page layouts.
* RecalcStyleCount (int): Total number of page style recalculations.
* LayoutDuration (int): Combined duration of all page layouts.
* RecalcStyleDuration (int): Combined duration of all page style recalculations.
* ScriptDuration (int): Combined duration of JavaScript execution.
* TaskDuration (int): Combined duration of all tasks performed by the browser.
* JSHeapUsedSize (float): Used JavaScript heap size.
* JSHeapTotalSize (float): Total JavaScript heap size.
- coroutine pdf(options: dict = None, **kwargs: Any) bytes [source]¶
Generate a pdf of the page.
Options:
* path (str): The file path to save the PDF.
* scale (float): Scale of the webpage rendering. Defaults to 1.
* displayHeaderFooter (bool): Display header and footer. Defaults to False.
* headerTemplate (str): HTML template for the print header. Should be valid HTML markup with the following classes:
  * date: formatted print date
  * title: document title
  * url: document location
  * pageNumber: current page number
  * totalPages: total pages in the document
* footerTemplate (str): HTML template for the print footer. Should use the same template as headerTemplate.
* printBackground (bool): Print background graphics. Defaults to False.
* landscape (bool): Paper orientation. Defaults to False.
* pageRanges (string): Paper ranges to print, e.g., ‘1-5,8,11-13’. Defaults to an empty string, which means all pages.
* format (str): Paper format. If set, takes priority over width or height. Defaults to Letter.
* width (str): Paper width, accepts values labeled with units.
* height (str): Paper height, accepts values labeled with units.
* margin (dict): Paper margins. Defaults to None.
  * top (str): Top margin, accepts values labeled with units.
  * right (str): Right margin, accepts values labeled with units.
  * bottom (str): Bottom margin, accepts values labeled with units.
  * left (str): Left margin, accepts values labeled with units.
* preferCSSPageSize: Give any CSS @page size declared in the page priority over what is declared in the width and height or format options. Defaults to False, which will scale the content to fit the paper size.
- Returns:
Return the generated PDF as a bytes object.
Note
Generating a pdf is currently only supported in headless mode.
pdf() generates a pdf of the page with print css media. To generate a pdf with screen media, call page.emulateMedia('screen') before calling pdf().
Note
By default, pdf() generates a pdf with modified colors for printing. Use the -webkit-print-color-adjust property to force rendering of exact colors.
    await page.emulateMedia('screen')
    await page.pdf({'path': 'page.pdf'})
The width, height, and margin options accept values labeled with units. Unlabeled values are treated as pixels.
A few examples:
* page.pdf({'width': 100}): prints with width set to 100 pixels.
* page.pdf({'width': '100px'}): prints with width set to 100 pixels.
* page.pdf({'width': '10cm'}): prints with width set to 10 centimeters.
All available units are:
* px: pixel
* in: inch
* cm: centimeter
* mm: millimeter
The format options are:
* Letter: 8.5in x 11in
* Legal: 8.5in x 14in
* Tabloid: 11in x 17in
* Ledger: 17in x 11in
* A0: 33.1in x 46.8in
* A1: 23.4in x 33.1in
* A2: 16.5in x 23.4in
* A3: 11.7in x 16.5in
* A4: 8.27in x 11.7in
* A5: 5.83in x 8.27in
* A6: 4.13in x 5.83in
Note
headerTemplate and footerTemplate markup have the following limitations:
* Script tags inside templates are not evaluated.
* Page styles are not visible inside templates.
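To make the unit rule concrete, here is a small sketch that converts a width, height or margin value to pixels. The conversion factors are an assumption based on the CSS convention of 96 pixels per inch; the factors pyppeteer uses internally may differ slightly. The helper name `to_pixels` is hypothetical.

```python
import re
from typing import Union

# Assumption: CSS reference values, 96 px per inch.
PX_PER_UNIT = {'px': 1.0, 'in': 96.0, 'cm': 96.0 / 2.54, 'mm': 96.0 / 25.4}


def to_pixels(value: Union[int, float, str]) -> float:
    """Convert a pdf() size such as 100, '100px' or '10cm' to pixels."""
    if isinstance(value, (int, float)):
        # Unlabeled values are treated as pixels.
        return float(value)
    match = re.fullmatch(r'\s*(\d+(?:\.\d+)?)\s*(px|in|cm|mm)?\s*', value)
    if match is None:
        raise ValueError('cannot parse size: {!r}'.format(value))
    number, unit = match.groups()
    return float(number) * PX_PER_UNIT[unit or 'px']
```

Under these assumed factors the Letter format (8.5in x 11in) corresponds to 816 x 1056 pixels.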
- coroutine queryObjects(prototypeHandle: JSHandle) JSHandle [source]¶
Iterate the JS heap and find all the objects with the given prototype handle.
- Parameters:
prototypeHandle (JSHandle) – JSHandle of prototype object.
- coroutine querySelector(selector: str) Optional[ElementHandle] [source]¶
Get an Element which matches selector.
- Parameters:
selector (str) – A selector to search for the element.
- Return Optional[ElementHandle]:
If an element which matches the selector is found, return its ElementHandle. If not found, return None.
- coroutine querySelectorAll(selector: str) List[ElementHandle] [source]¶
Get all elements which match selector as a list.
- Parameters:
selector (str) – A selector to search for elements.
- Return List[ElementHandle]:
List of ElementHandle which match the selector. If no element matches the selector, return an empty list.
- coroutine querySelectorAllEval(selector: str, pageFunction: str, *args: Any) Any [source]¶
Execute function with all elements which match selector.
- Parameters:
selector (str) – A selector to query page for.
pageFunction (str) – String of JavaScript function to be evaluated on browser. This function takes an Array of the matched elements as the first argument.
args (Any) – Arguments to pass to pageFunction.
- coroutine querySelectorEval(selector: str, pageFunction: str, *args: Any) Any [source]¶
Execute function with an element which matches selector.
- Parameters:
selector (str) – A selector to query page for.
pageFunction (str) – String of JavaScript function to be evaluated on browser. This function takes an element which matches the selector as the first argument.
args (Any) – Arguments to pass to pageFunction.
This method raises an error if no element matches the selector.
- coroutine reload(options: dict = None, **kwargs: Any) Optional[Response] [source]¶
Reload this page.
Available options are same as
goto()
method.
- coroutine screenshot(options: dict = None, **kwargs: Any) Union[bytes, str] [source]¶
Take a screenshot.
The following options are available:
* path (str): The file path to save the image to. The screenshot type will be inferred from the file extension.
* type (str): Specify screenshot type, can be either jpeg or png. Defaults to png.
* quality (int): The quality of the image, between 0-100. Not applicable to png images.
* fullPage (bool): When true, take a screenshot of the full scrollable page. Defaults to False.
* clip (dict): An object which specifies the clipping region of the page. This option should have the following fields:
  * x (int): x-coordinate of the top-left corner of the clip area.
  * y (int): y-coordinate of the top-left corner of the clip area.
  * width (int): width of the clipping area.
  * height (int): height of the clipping area.
* omitBackground (bool): Hide the default white background and allow capturing screenshots with transparency.
* encoding (str): The encoding of the image, can be either 'base64' or 'binary'. Defaults to 'binary'.
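The constraints among these options (quality is 0-100 and does not apply to png) can be checked up front. This is a sketch, not pyppeteer's own validation; the helper names `screenshot_options` and `capture` are hypothetical, and the type is set explicitly from the extension rather than left to be inferred.

```python
from typing import Any, Dict, Optional


def screenshot_options(path: str,
                       full_page: bool = False,
                       quality: Optional[int] = None) -> Dict[str, Any]:
    """Build a screenshot() options dict and reject inconsistent combinations."""
    image_type = 'jpeg' if path.lower().endswith(('.jpg', '.jpeg')) else 'png'
    options = {'path': path, 'type': image_type, 'fullPage': full_page}  # type: Dict[str, Any]
    if quality is not None:
        if image_type == 'png':
            raise ValueError('quality is not applicable to png images')
        if not 0 <= quality <= 100:
            raise ValueError('quality must be between 0 and 100')
        options['quality'] = quality
    return options


async def capture(page: Any, path: str) -> None:
    # `page` is expected to be a pyppeteer Page.
    await page.screenshot(screenshot_options(path, full_page=True))
```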
- coroutine select(selector: str, *values: str) List[str] [source]¶
Select options and return selected values.
If no element matches the selector, raise ElementHandleError.
- coroutine setBypassCSP(enabled: bool) None [source]¶
Toggles bypassing page’s Content-Security-Policy.
Note
CSP bypassing happens at the moment of CSP initialization rather than evaluation. Usually this means that page.setBypassCSP should be called before navigating to the domain.
- coroutine setCacheEnabled(enabled: bool = True) None [source]¶
Enable/Disable cache for each request.
By default, caching is enabled.
- coroutine setContent(html: str) None [source]¶
Set content to this page.
- Parameters:
html (str) – HTML markup to assign to the page.
- coroutine setCookie(*cookies: dict) None [source]¶
Set cookies.
cookies should be dictionaries which contain these fields:
* name (str): required
* value (str): required
* url (str)
* domain (str)
* path (str)
* expires (number): Unix time in seconds
* httpOnly (bool)
* secure (bool)
* sameSite (str): 'Strict' or 'Lax'
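- setDefaultNavigationTimeout(timeout: int) None [source]¶
Change the default maximum navigation timeout.
This method changes the default timeout of 30 seconds for the following methods:
* goto()
* goBack()
* goForward()
* reload()
* waitForNavigation()
- Parameters:
timeout (int) – Maximum navigation time in milliseconds. Pass 0 to disable timeout.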
- coroutine setExtraHTTPHeaders(headers: Dict[str, str]) None [source]¶
Set extra HTTP headers.
The extra HTTP headers will be sent with every request the page initiates.
Note
page.setExtraHTTPHeaders does not guarantee the order of headers in the outgoing requests.
- Parameters:
headers (Dict) – A dictionary containing additional HTTP headers to be sent with every request. All header values must be strings.
- coroutine setRequestInterception(value: bool) None [source]¶
Enable/disable request interception.
Activating request interception enables the Request class’s abort(), continue_(), and respond() methods. This provides the capability to modify network requests that are made by a page.
Once request interception is enabled, every request will stall unless it’s continued, responded or aborted.
An example of a naive request interceptor that aborts all image requests:
    browser = await launch()
    page = await browser.newPage()
    await page.setRequestInterception(True)

    async def intercept(request):
        if request.url.endswith('.png') or request.url.endswith('.jpg'):
            await request.abort()
        else:
            await request.continue_()

    page.on('request', lambda req: asyncio.ensure_future(intercept(req)))
    await page.goto('https://example.com')
    await browser.close()
- coroutine setUserAgent(userAgent: str) None [source]¶
Set user agent to use in this page.
- Parameters:
userAgent (str) – Specific user agent to use in this page
- coroutine setViewport(viewport: dict) None [source]¶
Set viewport.
Available options are:
* width (int): page width in pixels.
* height (int): page height in pixels.
* deviceScaleFactor (float): Defaults to 1.0.
* isMobile (bool): Defaults to False.
* hasTouch (bool): Defaults to False.
* isLandscape (bool): Defaults to False.
- coroutine tap(selector: str) None [source]¶
Tap the element which matches the selector.
- Parameters:
selector (str) – A selector to search for the element to touch.
- property touchscreen: Touchscreen¶
Get
Touchscreen
object.
- coroutine type(selector: str, text: str, options: dict = None, **kwargs: Any) None [source]¶
Type text on the element which matches selector.
If no element matches the selector, raise PageError.
For details, see pyppeteer.input.Keyboard.type().
- property url: str¶
Get URL of this page.
- property viewport: Optional[Dict]¶
Get viewport as a dictionary or None.
The fields of the returned dictionary are the same as for setViewport().
- waitFor(selectorOrFunctionOrTimeout: Union[str, int, float], options: dict = None, *args: Any, **kwargs: Any) Awaitable [source]¶
Wait for function, timeout, or element which matches on page.
This method behaves differently with respect to the first argument:
* If selectorOrFunctionOrTimeout is a number (int or float), it is treated as a timeout in milliseconds and this returns a future which will be done after the timeout.
* If selectorOrFunctionOrTimeout is a string of a JavaScript function, this method is a shortcut to waitForFunction().
* If selectorOrFunctionOrTimeout is a selector string or xpath string, this method is a shortcut to waitForSelector() or waitForXPath(). If the string starts with //, the string is treated as xpath.
Pyppeteer tries to automatically detect function or selector, but sometimes misdetects. If it does not work as you expect, use waitForFunction() or waitForSelector() directly.
- Parameters:
selectorOrFunctionOrTimeout – A selector, xpath, or function string, or timeout (milliseconds).
args (Any) – Arguments to pass to the function.
- Returns:
Return an awaitable object which resolves to a JSHandle of the success value.
Available options: see waitForFunction() or waitForSelector()
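The dispatch on the first argument can be illustrated with a small classifier. This sketch covers only the parts stated above (numbers are timeouts, strings starting with // are xpath); it deliberately does not reproduce pyppeteer's heuristic for telling a function string from a selector, and the name `classify_wait_target` is hypothetical.

```python
from typing import Union


def classify_wait_target(target: Union[str, int, float]) -> str:
    """Classify a waitFor() first argument according to the documented rules."""
    if isinstance(target, (int, float)):
        # Numbers are treated as a timeout in milliseconds.
        return 'timeout'
    if target.startswith('//'):
        # Strings starting with // are treated as xpath.
        return 'xpath'
    # Function-vs-selector detection is heuristic and not modelled here.
    return 'selector-or-function'
```

When the category matters, calling waitForFunction(), waitForSelector() or waitForXPath() directly avoids relying on the automatic detection.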
- waitForFunction(pageFunction: str, options: dict = None, *args: str, **kwargs: Any) Awaitable [source]¶
Wait until the function completes and returns a truthy value.
- Parameters:
args (Any) – Arguments to pass to pageFunction.
- Returns:
Return an awaitable object which resolves when the pageFunction returns a truthy value. It resolves to a JSHandle of the truthy value.
This method accepts the following options:
* polling (str|number): An interval at which the pageFunction is executed. Defaults to raf. If polling is a number, it is treated as an interval in milliseconds at which the function would be executed. If polling is a string, it can be one of the following values:
  * raf: to constantly execute pageFunction in a requestAnimationFrame callback. This is the tightest polling mode, which is suitable for observing styling changes.
  * mutation: to execute pageFunction on every DOM mutation.
* timeout (int|float): Maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
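A short sketch of the polling option follows. The helper `polling_option` is hypothetical; rejecting non-positive intervals is a choice made by this sketch rather than documented behaviour, and the 250 ms interval, 5000 ms timeout and width check are arbitrary example values.

```python
from typing import Any, Union


def polling_option(polling: Union[str, int, float] = 'raf') -> Union[str, int, float]:
    """Validate a waitForFunction() polling value against the documented forms."""
    if isinstance(polling, str):
        if polling not in ('raf', 'mutation'):
            raise ValueError('polling must be "raf", "mutation" or a number')
        return polling
    if polling <= 0:
        # Assumption of this sketch: an interval must be positive.
        raise ValueError('polling interval must be positive')
    return polling


async def wait_for_wide_window(page: Any, min_width: int = 1024) -> Any:
    # `page` is expected to be a pyppeteer Page; the result is a JSHandle
    # of the truthy value returned by the page function.
    return await page.waitForFunction(
        '(minWidth) => window.innerWidth >= minWidth',
        {'polling': polling_option(250), 'timeout': 5000},
        min_width,
    )
```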
- coroutine waitForNavigation(options: dict = None, **kwargs: Any) Optional[Response] [source]¶
Wait for navigation.
Available options are the same as for the goto() method.
This returns a Response when the page navigates to a new URL or reloads. It is useful when you run code which will indirectly cause the page to navigate. In case of navigation to a different anchor or navigation due to History API usage, the navigation will return None.
Consider this example:
    navigationPromise = asyncio.ensure_future(page.waitForNavigation())
    await page.click('a.my-link')  # indirectly cause a navigation
    await navigationPromise  # wait until navigation finishes
or,
    await asyncio.gather(
        page.click('a.my-link'),
        page.waitForNavigation(),
    )
Note
Usage of the History API to change the URL is considered a navigation.
- coroutine waitForRequest(urlOrPredicate: Union[str, Callable[[Request], bool]], options: Dict = None, **kwargs: Any) Request [source]¶
Wait for request.
- Parameters:
urlOrPredicate – A URL or function to wait for.
This method accepts the following options:
timeout (int|float): Maximum wait time in milliseconds, defaults to 30 seconds, pass 0 to disable the timeout.
Example:
    firstRequest = await page.waitForRequest('http://example.com/resource')
    finalRequest = await page.waitForRequest(lambda req: req.url == 'http://example.com' and req.method == 'GET')
    return firstRequest.url
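When the predicate is reused, a small factory can replace the inline lambda. The sketch below is illustrative; request_matcher is a hypothetical name, and it relies only on the url and method properties documented for Request.

```python
def request_matcher(url, method='GET'):
    """Build a predicate suitable for waitForRequest() (hypothetical helper)."""
    def predicate(request):
        # `url` and `method` are documented Request properties.
        return request.url == url and request.method == method
    return predicate
```

It would be used as `await page.waitForRequest(request_matcher('http://example.com'))`.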
- coroutine waitForResponse(urlOrPredicate: Union[str, Callable[[Response], bool]], options: Dict = None, **kwargs: Any) Response [source]¶
Wait for response.
- Parameters:
urlOrPredicate – A URL or function to wait for.
This method accepts the following options:
timeout (int|float): Maximum wait time in milliseconds, defaults to 30 seconds, pass 0 to disable the timeout.
Example:
    firstResponse = await page.waitForResponse('http://example.com/resource')
    finalResponse = await page.waitForResponse(lambda res: res.url == 'http://example.com' and res.status == 200)
    return finalResponse.ok
- waitForSelector(selector: str, options: dict = None, **kwargs: Any) Awaitable [source]¶
Wait until element which matches selector appears on page.
Wait for the selector to appear in page. If at the moment of calling the method the selector already exists, the method will return immediately. If the selector doesn’t appear after the timeout milliseconds of waiting, the function will raise an error.
- Parameters:
selector (str) – A selector of an element to wait for.
- Returns:
Return awaitable object which resolves when element specified by selector string is added to DOM.
This method accepts the following options:
visible (bool): Wait for element to be present in DOM and to be visible; i.e. to not have display: none or visibility: hidden CSS properties. Defaults to False.
hidden (bool): Wait for element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties. Defaults to False.
timeout (int|float): Maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
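The hidden option is handy for waiting until a loading indicator goes away. A minimal sketch, assuming page exposes waitForSelector() as documented here; wait_until_gone and its 10 second default are hypothetical.

```python
async def wait_until_gone(page, selector, timeout_ms=10000):
    """Wait until `selector` is hidden or removed from the DOM (hypothetical helper)."""
    options = {
        'hidden': True,         # resolve once the element is hidden or absent
        'timeout': timeout_ms,  # milliseconds; 0 would disable the timeout
    }
    return await page.waitForSelector(selector, options)
```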
- waitForXPath(xpath: str, options: dict = None, **kwargs: Any) Awaitable [source]¶
Wait until element which matches xpath appears on page.
Wait for the xpath to appear in page. If at the moment of calling the method the xpath already exists, the method will return immediately. If the xpath doesn’t appear after timeout milliseconds of waiting, the function will raise an exception.
- Parameters:
xpath (str) – An XPath of an element to wait for.
- Returns:
Return awaitable object which resolves when element specified by xpath string is added to DOM.
Available options are:
visible (bool): wait for element to be present in DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties. Defaults to False.
hidden (bool): wait for element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties. Defaults to False.
timeout (int|float): maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
- coroutine xpath(expression: str) List[ElementHandle] [source]¶
Evaluate the XPath expression.
If there are no such elements in this page, return an empty list.
- Parameters:
expression (str) – XPath string to be evaluated.
Worker Class¶
- class pyppeteer.worker.Worker(client: CDPSession, url: str, consoleAPICalled: Callable[[str, List[JSHandle]], None], exceptionThrown: Callable[[Dict], None])[source]¶
The Worker class represents a WebWorker.
The events workercreated and workerdestroyed are emitted on the page object to signal the worker lifecycle.
    page.on('workercreated', lambda worker: print('Worker created:', worker.url))
- coroutine evaluate(pageFunction: str, *args: Any) Any [source]¶
Evaluate pageFunction with args.
Shortcut for (await worker.executionContext()).evaluate(pageFunction, *args).
- coroutine evaluateHandle(pageFunction: str, *args: Any) JSHandle [source]¶
Evaluate pageFunction with args and return JSHandle.
Shortcut for (await worker.executionContext()).evaluateHandle(pageFunction, *args).
- coroutine executionContext() ExecutionContext [source]¶
Return ExecutionContext.
- property url: str¶
Return URL.
Keyboard Class¶
- class pyppeteer.input.Keyboard(client: CDPSession)[source]¶
Keyboard class provides an api for managing a virtual keyboard.
The high level api is type(), which takes raw characters and generates proper keydown, keypress/input, and keyup events on your page.
For finer control, you can use down(), up(), and sendCharacter() to manually fire events as if they were generated from a real keyboard.
An example of holding down Shift in order to select and delete some text:
    await page.keyboard.type('Hello, World!')
    await page.keyboard.press('ArrowLeft')
    await page.keyboard.down('Shift')
    for i in ' World':
        await page.keyboard.press('ArrowLeft')
    await page.keyboard.up('Shift')
    await page.keyboard.press('Backspace')
    # Result text will end up saying 'Hello!'.
An example of pressing A:
    await page.keyboard.down('Shift')
    await page.keyboard.press('KeyA')
    await page.keyboard.up('Shift')
- coroutine down(key: str, options: dict = None, **kwargs: Any) None [source]¶
Dispatch a keydown event with key.
If key is a single character and no modifier keys besides Shift are being held down, a keypress/input event will also be generated. The text option can be specified to force an input event to be generated.
If key is a modifier key, like Shift, Meta, or Alt, subsequent key presses will be sent with that modifier active. To release the modifier key, use up() method.
- Parameters:
key (str) – Name of key to press, such as ArrowLeft.
options (dict) – Option can have text field, and if this option is specified, generate an input event with this text.
Note
Modifier keys DO influence down(). Holding down Shift will type the text in upper case.
- coroutine press(key: str, options: Dict = None, **kwargs: Any) None [source]¶
Press key.
If key is a single character and no modifier keys besides Shift are being held down, a keypress/input event will also be generated. The text option can be specified to force an input event to be generated.
- Parameters:
key (str) – Name of key to press, such as ArrowLeft.
This method accepts the following options:
text (str): If specified, generates an input event with this text.
delay (int|float): Time to wait between keydown and keyup. Defaults to 0.
Note
Modifier keys DO affect press(). Holding down Shift will type the text in upper case.
- coroutine sendCharacter(char: str) None [source]¶
Send character into the page.
This method dispatches a keypress and input event. This does not send a keydown or keyup event.
Note
Modifier keys DO NOT affect sendCharacter(). Holding down Shift will not type the text in upper case.
- coroutine type(text: str, options: Dict = None, **kwargs: Any) None [source]¶
Type characters into a focused element.
This method sends keydown, keypress/input, and keyup events for each character in the text.
To press a special key, like Control or ArrowDown, use press() method.
- Parameters:
text (str) – Text to type into a focused element.
options (dict) – Options can have delay (int|float) field, which specifies time to wait between key presses in milliseconds. Defaults to 0.
Note
Modifier keys DO NOT affect type(). Holding down Shift will not type the text in upper case.
Mouse Class¶
- class pyppeteer.input.Mouse(client: CDPSession, keyboard: Keyboard)[source]¶
Mouse class.
The Mouse operates in main-frame CSS pixels relative to the top-left corner of the viewport.
- coroutine click(x: float, y: float, options: dict = None, **kwargs: Any) None [source]¶
Click button at (x, y).
Shortcut to move(), down(), and up().
This method accepts the following options:
button (str): left, right, or middle, defaults to left.
clickCount (int): defaults to 1.
delay (int|float): Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
- coroutine down(options: dict = None, **kwargs: Any) None [source]¶
Press down button (dispatches mousedown event).
This method accepts the following options:
button (str): left, right, or middle, defaults to left.
clickCount (int): defaults to 1.
Tracing Class¶
- class pyppeteer.tracing.Tracing(client: CDPSession)[source]¶
Tracing class.
You can use start() and stop() to create a trace file which can be opened in Chrome DevTools or timeline viewer.
    await page.tracing.start({'path': 'trace.json'})
    await page.goto('https://www.google.com')
    await page.tracing.stop()
- coroutine start(options: dict = None, **kwargs: Any) None [source]¶
Start tracing.
Only one trace can be active at a time per browser.
This method accepts the following options:
path (str): A path to write the trace file to.
screenshots (bool): Capture screenshots in the trace.
categories (List[str]): Specify custom categories to use instead of default.
Dialog Class¶
- class pyppeteer.dialog.Dialog(client: CDPSession, type: str, message: str, defaultValue: str = '')[source]¶
Dialog class.
Dialog objects are dispatched by page via the dialog event.
An example of using Dialog class:
    browser = await launch()
    page = await browser.newPage()

    async def close_dialog(dialog):
        print(dialog.message)
        await dialog.dismiss()
        await browser.close()

    page.on(
        'dialog',
        lambda dialog: asyncio.ensure_future(close_dialog(dialog))
    )
    await page.evaluate('() => alert("1")')
- coroutine accept(promptText: str = '') None [source]¶
Accept the dialog.
promptText (str): A text to enter in prompt. If the dialog’s type is not prompt, this has no effect.
- property defaultValue: str¶
If dialog is prompt, get default prompt value.
If dialog is not prompt, return empty string ('').
- property message: str¶
Get dialog message.
- property type: str¶
Get dialog type.
One of alert, beforeunload, confirm, or prompt.
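A handler can branch on the dialog type. The following is a sketch of one possible policy (answer prompts, accept confirms, dismiss everything else); answer_dialog and its default text are hypothetical, and only the type property and the accept() and dismiss() methods shown in this section are used.

```python
async def answer_dialog(dialog, prompt_text='yes'):
    """Handle a Dialog according to its type (hypothetical policy)."""
    if dialog.type == 'prompt':
        # promptText only has an effect for prompt dialogs.
        await dialog.accept(prompt_text)
    elif dialog.type == 'confirm':
        await dialog.accept()
    else:
        # alert and beforeunload dialogs are simply dismissed here.
        await dialog.dismiss()
```

It would be registered like the example above: `page.on('dialog', lambda d: asyncio.ensure_future(answer_dialog(d)))`.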
ConsoleMessage Class¶
- class pyppeteer.page.ConsoleMessage(type: str, text: str, args: List[JSHandle] = None)[source]¶
Console message class.
ConsoleMessage objects are dispatched by page via the console event.
- property text: str¶
Return text representation of this message.
- property type: str¶
Return type of this message.
Frame Class¶
- class pyppeteer.frame_manager.Frame(client: CDPSession, parentFrame: Optional[Frame], frameId: str)[source]¶
Frame class.
Frame objects can be obtained via pyppeteer.page.Page.mainFrame.
- coroutine J(selector: str) Optional[ElementHandle] ¶
Alias to
querySelector()
- coroutine JJ(selector: str) List[ElementHandle] ¶
Alias to
querySelectorAll()
- coroutine JJeval(selector: str, pageFunction: str, *args: Any) Optional[Dict] ¶
Alias to
querySelectorAllEval()
- coroutine Jeval(selector: str, pageFunction: str, *args: Any) Any ¶
Alias to
querySelectorEval()
- coroutine Jx(expression: str) List[ElementHandle] ¶
Alias to
xpath()
- coroutine addScriptTag(options: Dict) ElementHandle [source]¶
Add script tag to this frame.
Details see
pyppeteer.page.Page.addScriptTag()
.
- coroutine addStyleTag(options: Dict) ElementHandle [source]¶
Add style tag to this frame.
Details see
pyppeteer.page.Page.addStyleTag()
.
- coroutine click(selector: str, options: dict = None, **kwargs: Any) None [source]¶
Click element which matches
selector
.Details see
pyppeteer.page.Page.click()
.
- coroutine evaluate(pageFunction: str, *args: Any, force_expr: bool = False) Any [source]¶
Evaluate pageFunction on this frame.
Details see
pyppeteer.page.Page.evaluate()
.
- coroutine evaluateHandle(pageFunction: str, *args: Any) JSHandle [source]¶
Execute function on this frame.
Details see
pyppeteer.page.Page.evaluateHandle()
.
- coroutine executionContext() Optional[ExecutionContext] [source]¶
Return execution context of this frame.
Return
ExecutionContext
associated to this frame.
- coroutine focus(selector: str) None [source]¶
Focus element which matches
selector
.Details see
pyppeteer.page.Page.focus()
.
- coroutine hover(selector: str) None [source]¶
Mouse hover the element which matches
selector
.Details see
pyppeteer.page.Page.hover()
.
- property name: str¶
Get frame name.
- property parentFrame: Optional[Frame]¶
Get parent frame.
If this frame is main frame or detached frame, return
None
.
- coroutine querySelector(selector: str) Optional[ElementHandle] [source]¶
Get element which matches
selector
string.Details see
pyppeteer.page.Page.querySelector()
.
- coroutine querySelectorAll(selector: str) List[ElementHandle] [source]¶
Get all elements which matches
selector
.Details see
pyppeteer.page.Page.querySelectorAll()
.
- coroutine querySelectorAllEval(selector: str, pageFunction: str, *args: Any) Optional[Dict] [source]¶
Execute function on all elements which matches selector.
Details see
pyppeteer.page.Page.querySelectorAllEval()
.
- coroutine querySelectorEval(selector: str, pageFunction: str, *args: Any) Any [source]¶
Execute function on element which matches selector.
Details see
pyppeteer.page.Page.querySelectorEval()
.
- coroutine select(selector: str, *values: str) List[str] [source]¶
Select options and return selected values.
Details see
pyppeteer.page.Page.select()
.
- coroutine tap(selector: str) None [source]¶
Tap the element which matches the
selector
.Details see
pyppeteer.page.Page.tap()
.
- coroutine type(selector: str, text: str, options: dict = None, **kwargs: Any) None [source]¶
Type
text
on the element which matchesselector
.Details see
pyppeteer.page.Page.type()
.
- property url: str¶
Get url of the frame.
- waitFor(selectorOrFunctionOrTimeout: Union[str, int, float], options: dict = None, *args: Any, **kwargs: Any) Union[Awaitable, WaitTask] [source]¶
Wait until
selectorOrFunctionOrTimeout
.Details see
pyppeteer.page.Page.waitFor()
.
- waitForFunction(pageFunction: str, options: dict = None, *args: Any, **kwargs: Any) WaitTask [source]¶
Wait until the function completes.
Details see
pyppeteer.page.Page.waitForFunction()
.
- waitForSelector(selector: str, options: dict = None, **kwargs: Any) WaitTask [source]¶
Wait until element which matches
selector
appears on page.Details see
pyppeteer.page.Page.waitForSelector()
.
- waitForXPath(xpath: str, options: dict = None, **kwargs: Any) WaitTask [source]¶
Wait until element which matches
xpath
appears on page.Details see
pyppeteer.page.Page.waitForXPath()
.
- coroutine xpath(expression: str) List[ElementHandle] [source]¶
Evaluate the XPath expression.
If there are no such elements in this frame, return an empty list.
- Parameters:
expression (str) – XPath string to be evaluated.
ExecutionContext Class¶
- class pyppeteer.execution_context.ExecutionContext(client: CDPSession, contextPayload: Dict, objectHandleFactory: Any, frame: Frame = None)[source]¶
Execution Context class.
- coroutine evaluate(pageFunction: str, *args: Any, force_expr: bool = False) Any [source]¶
Execute
pageFunction
on this context.Details see
pyppeteer.page.Page.evaluate()
.
- coroutine evaluateHandle(pageFunction: str, *args: Any, force_expr: bool = False) JSHandle [source]¶
Execute
pageFunction
on this context.Details see
pyppeteer.page.Page.evaluateHandle()
.
- coroutine queryObjects(prototypeHandle: JSHandle) JSHandle [source]¶
Send query.
Details see
pyppeteer.page.Page.queryObjects()
.
JSHandle Class¶
- class pyppeteer.execution_context.JSHandle(context: ExecutionContext, client: CDPSession, remoteObject: Dict)[source]¶
JSHandle class.
JSHandle represents an in-page JavaScript object. JSHandle can be created with the evaluateHandle() method.
- asElement() Optional[ElementHandle] [source]¶
Return either None or the object handle itself.
- property executionContext: ExecutionContext¶
Get execution context of this handle.
ElementHandle Class¶
- class pyppeteer.element_handle.ElementHandle(context: ExecutionContext, client: CDPSession, remoteObject: dict, page: Any, frameManager: FrameManager)[source]¶
ElementHandle class.
This class represents an in-page DOM element. ElementHandle can be created by the pyppeteer.page.Page.querySelector() method.
ElementHandle prevents DOM element from garbage collection unless the handle is disposed. ElementHandles are automatically disposed when their origin frame gets navigated.
ElementHandle instances can be used as arguments in pyppeteer.page.Page.querySelectorEval() and pyppeteer.page.Page.evaluate() methods.
- coroutine J(selector: str) Optional[ElementHandle] ¶
alias to
querySelector()
- coroutine JJ(selector: str) List[ElementHandle] ¶
alias to
querySelectorAll()
- coroutine JJeval(selector: str, pageFunction: str, *args: Any) Any ¶
alias to
querySelectorAllEval()
- coroutine Jeval(selector: str, pageFunction: str, *args: Any) Any ¶
alias to
querySelectorEval()
- coroutine Jx(expression: str) List[ElementHandle] ¶
alias to
xpath()
- asElement() ElementHandle [source]¶
Return this ElementHandle.
- coroutine boundingBox() Optional[Dict[str, float]] [source]¶
Return bounding box of this element.
If the element is not visible, return None.
This method returns dictionary of bounding box, which contains:
x (int): The X coordinate of the element in pixels.
y (int): The Y coordinate of the element in pixels.
width (int): The width of the element in pixels.
height (int): The height of the element in pixels.
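The bounding box dictionary is easy to post-process. As a small illustration, the hypothetical helper box_center (not part of pyppeteer) computes the center point of a box, handling the None case returned for invisible elements.

```python
def box_center(box):
    """Return the (x, y) center of a boundingBox() dictionary, or None."""
    if box is None:  # boundingBox() returns None when the element is not visible
        return None
    return (box['x'] + box['width'] / 2, box['y'] + box['height'] / 2)
```

The resulting coordinates could then be passed to a mouse call such as `page.mouse.click(x, y)`, after checking that the result is not None.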
- coroutine boxModel() Optional[Dict] [source]¶
Return boxes of element.
Return None if element is not visible. Boxes are represented as a list of points; each Point is a dictionary {x, y}. Box points are sorted clockwise.
Returned value is a dictionary with the following fields:
content (List[Dict]): Content box.
padding (List[Dict]): Padding box.
border (List[Dict]): Border box.
margin (List[Dict]): Margin box.
width (int): Element’s width.
height (int): Element’s height.
- coroutine click(options: dict = None, **kwargs: Any) None [source]¶
Click the center of this element.
If needed, this method scrolls element into view. If the element is detached from DOM, the method raises ElementHandleError.
options can contain the following fields:
button (str): left, right, or middle, defaults to left.
clickCount (int): Defaults to 1.
delay (int|float): Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
- coroutine contentFrame() Optional[Frame] [source]¶
Return the content frame for the element handle.
Return
None
if this handle is not referencing iframe.
- coroutine hover() None [source]¶
Move mouse over to center of this element.
If needed, this method scrolls element into view. If this element is detached from DOM tree, the method raises an
ElementHandleError
.
- coroutine isIntersectingViewport() bool [source]¶
Return
True
if the element is visible in the viewport.
- coroutine press(key: str, options: Dict = None, **kwargs: Any) None [source]¶
Press key onto the element.
This method focuses the element, and then uses pyppeteer.input.Keyboard.down() and pyppeteer.input.Keyboard.up().
- Parameters:
key (str) – Name of key to press, such as ArrowLeft.
This method accepts the following options:
text (str): If specified, generates an input event with this text.
delay (int|float): Time to wait between keydown and keyup. Defaults to 0.
- coroutine querySelector(selector: str) Optional[ElementHandle] [source]¶
Return first element which matches selector under this element.
If no element matches the selector, returns None.
- coroutine querySelectorAll(selector: str) List[ElementHandle] [source]¶
Return all elements which match selector under this element.
If no element matches the selector, returns empty list ([]).
- coroutine querySelectorAllEval(selector: str, pageFunction: str, *args: Any) Any [source]¶
Run Page.querySelectorAllEval within the element.
This method runs Array.from(document.querySelectorAll) within the element and passes it as the first argument to pageFunction. If there is no element matching selector, the method raises ElementHandleError.
If pageFunction returns a promise, then wait for the promise to resolve and return its value.
Example:
    <div class="feed">
        <div class="tweet">Hello!</div>
        <div class="tweet">Hi!</div>
    </div>

    feedHandle = await page.J('.feed')
    assert (await feedHandle.JJeval('.tweet', '(nodes => nodes.map(n => n.innerText))')) == ['Hello!', 'Hi!']
- coroutine querySelectorEval(selector: str, pageFunction: str, *args: Any) Any [source]¶
Run Page.querySelectorEval within the element.
This method runs document.querySelector within the element and passes it as the first argument to pageFunction. If there is no element matching selector, the method raises ElementHandleError.
If pageFunction returns a promise, then wait for the promise to resolve and return its value.
ElementHandle.Jeval is a shortcut of this method.
Example:
    tweetHandle = await page.querySelector('.tweet')
    # innerText is a string, so compare against string values
    assert (await tweetHandle.querySelectorEval('.like', 'node => node.innerText')) == '100'
    assert (await tweetHandle.Jeval('.retweets', 'node => node.innerText')) == '10'
- coroutine screenshot(options: Dict = None, **kwargs: Any) bytes [source]¶
Take a screenshot of this element.
If the element is detached from DOM, this method raises an ElementHandleError.
Available options are same as pyppeteer.page.Page.screenshot().
- coroutine tap() None [source]¶
Tap the center of this element.
If needed, this method scrolls element into view. If the element is detached from DOM, the method raises
ElementHandleError
.
- coroutine type(text: str, options: Dict = None, **kwargs: Any) None [source]¶
Focus the element and then type text.
Details see
pyppeteer.input.Keyboard.type()
method.
- coroutine xpath(expression: str) List[ElementHandle] [source]¶
Evaluate the XPath expression relative to this elementHandle.
If there are no such elements, return an empty list.
- Parameters:
expression (str) – XPath string to be evaluated.
Request Class¶
- class pyppeteer.network_manager.Request(client: CDPSession, requestId: Optional[str], interceptionId: Optional[str], isNavigationRequest: bool, allowInterception: bool, url: str, resourceType: str, payload: dict, frame: Optional[Frame], redirectChain: List[Request])[source]¶
Request class.
Whenever the page sends a request, such as for a network resource, the following events are emitted by pyppeteer’s page:
'request': emitted when the request is issued by the page.
'response': emitted when/if the response is received for the request.
'requestfinished': emitted when the response body is downloaded and the request is complete.
If request fails at some point, then instead of 'requestfinished' event (and possibly instead of 'response' event), the 'requestfailed' event is emitted.
If request gets a 'redirect' response, the request is successfully finished with the 'requestfinished' event, and a new request is issued to a redirect url.
- coroutine abort(errorCode: str = 'failed') None [source]¶
Abort request.
To use this, request interception should be enabled by pyppeteer.page.Page.setRequestInterception(). If request interception is not enabled, raise NetworkError.
errorCode is an optional error code string. Defaults to failed, could be one of the following:
aborted: An operation was aborted (due to user action).
accessdenied: Permission to access a resource, other than the network, was denied.
addressunreachable: The IP address is unreachable. This usually means that there is no route to the specified host or network.
blockedbyclient: The client chose to block the request.
blockedbyresponse: The request failed because the request was delivered along with requirements which are not met (‘X-Frame-Options’ and ‘Content-Security-Policy’ ancestor check, for instance).
connectionaborted: A connection timeout as a result of not receiving an ACK for data sent.
connectionclosed: A connection was closed (corresponding to a TCP FIN).
connectionfailed: A connection attempt failed.
connectionrefused: A connection attempt was refused.
connectionreset: A connection was reset (corresponding to a TCP RST).
internetdisconnected: The Internet connection has been lost.
namenotresolved: The host name could not be resolved.
timedout: An operation timed out.
failed: A generic failure occurred.
- coroutine continue_(overrides: Dict = None) None [source]¶
Continue request with optional request overrides.
To use this method, request interception should be enabled by pyppeteer.page.Page.setRequestInterception(). If request interception is not enabled, raise NetworkError.
overrides can have the following fields:
url (str): If set, the request url will be changed.
method (str): If set, change the request method (e.g. GET).
postData (str): If set, change the post data of request.
headers (dict): If set, change the request HTTP header.
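A request handler typically chooses between abort() and continue_(). The sketch below blocks a few resource types and lets everything else through; filter_request and the BLOCKED_TYPES set are illustrative choices, not part of pyppeteer.

```python
# Resource types to block; an arbitrary selection for illustration.
BLOCKED_TYPES = {'image', 'media', 'font'}


async def filter_request(request):
    """Abort heavy resources and continue all other requests (hypothetical handler)."""
    if request.resourceType in BLOCKED_TYPES:
        await request.abort('blockedbyclient')
    else:
        await request.continue_()
```

Assuming interception was enabled with `await page.setRequestInterception(True)`, the handler could be attached with `page.on('request', lambda req: asyncio.ensure_future(filter_request(req)))`.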
- failure() Optional[Dict] [source]¶
Return error text.
Return None unless this request failed, as reported by requestfailed event.
When the request has failed, this method returns a dictionary which has an errorText field, which contains a human-readable error message, e.g. 'net::ERR_FAILED'.
- property frame: Optional[Frame]¶
Return a matching frame object.
Return None if navigating to error page.
- property headers: Dict¶
Return a dictionary of HTTP headers of this request.
All header names are lower-case.
- isNavigationRequest() bool [source]¶
Whether this request is driving frame’s navigation.
- property method: Optional[str]¶
Return this request’s method (GET, POST, etc.).
- property postData: Optional[str]¶
Return post body of this request.
- property redirectChain: List[Request]¶
Return chain of requests initiated to fetch a resource.
If there are no redirects and request was successful, the chain will be empty.
If a server responds with at least a single redirect, then the chain will contain all the requests that were redirected.
redirectChain is shared between all the requests of the same chain.
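To inspect a chain, the URLs can be collected into a list. This is a sketch under two assumptions: that redirectChain holds the redirected requests in the order they were issued, and that it does not include the final request itself. redirect_urls is a hypothetical helper name.

```python
def redirect_urls(request):
    """List the URLs requested on the way to `request`, ending with its own URL."""
    # Assumed: the chain holds only the redirected requests, oldest first.
    return [r.url for r in request.redirectChain] + [request.url]
```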
- property resourceType: str¶
Resource type of this request perceived by the rendering engine.
ResourceType will be one of the following: document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other.
- coroutine respond(response: Dict) None [source]¶
Fulfills request with given response.
To use this, request interception should be enabled by pyppeteer.page.Page.setRequestInterception(). If request interception is not enabled, raise NetworkError.
response is a dictionary which can have the following fields:
status (int): Response status code, defaults to 200.
headers (dict): Optional response headers.
contentType (str): If set, equivalent to setting the Content-Type response header.
body (str|bytes): Optional response body.
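The response dictionary can be assembled by a small helper. The sketch below builds a JSON response using only the fields listed above; json_response is a hypothetical name, not a pyppeteer function.

```python
import json


def json_response(payload, status=200):
    """Build a response dictionary for Request.respond() (hypothetical helper)."""
    return {
        'status': status,
        'contentType': 'application/json',
        'body': json.dumps(payload),  # serialized to str, as `body` accepts str|bytes
    }
```

Inside a request handler this would be used as `await request.respond(json_response({'ok': True}))`.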
- property response: Optional[Response]¶
Return matching Response object, or None.
If the response has not been received, return None.
- property url: str¶
URL of this request.
Response Class¶
- class pyppeteer.network_manager.Response(client: CDPSession, request: Request, status: int, headers: Dict[str, str], fromDiskCache: bool, fromServiceWorker: bool, securityDetails: Dict = None)[source]¶
Response class represents responses which are received by Page.
- property fromCache: bool¶
Return True if the response was served from cache.
Here cache is either the browser’s disk cache or memory cache.
- property fromServiceWorker: bool¶
Return
True
if the response was served by a service worker.
- property headers: Dict¶
Return dictionary of HTTP headers of this response.
All header names are lower-case.
- property ok: bool¶
Return bool whether this response is successful (200-299) or not.
- property securityDetails: Union[Dict, SecurityDetails]¶
Return security details associated with this response.
Security details if the response was received over the secure connection, or None otherwise.
- property status: int¶
Status code of the response.
- property url: str¶
URL of the response.
Target Class¶
- class pyppeteer.target.Target(targetInfo: Dict, browserContext: BrowserContext, sessionFactory: Callable[[], Coroutine[Any, Any, CDPSession]], ignoreHTTPSErrors: bool, defaultViewport: Optional[Dict], screenshotTaskQueue: List, loop: AbstractEventLoop)[source]¶
Browser’s target class.
- property browserContext: BrowserContext¶
Return the browser context the target belongs to.
- coroutine createCDPSession() CDPSession [source]¶
Create a Chrome Devtools Protocol session attached to the target.
- property opener: Optional[Target]¶
Get the target that opened this target.
Top-level targets return
None
.
- coroutine page() Optional[Page] [source]¶
Get page of this target.
If the target is not of type “page” or “background_page”, return
None
.
- property type: str¶
Get type of this target.
Type can be 'page', 'background_page', 'service_worker', 'browser', or 'other'.
- property url: str¶
Get url of this target.
CDPSession Class¶
- class pyppeteer.connection.CDPSession(connection: Union[Connection, CDPSession], targetType: str, sessionId: str, loop: AbstractEventLoop)[source]¶
Chrome Devtools Protocol Session.
The CDPSession instances are used to talk raw Chrome Devtools Protocol:
protocol methods can be called with send() method.
protocol events can be subscribed to with on() method.
Documentation on DevTools Protocol can be found here.
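A minimal sketch of using a raw session follows. It assumes the page object exposes its Target through a target property and that the protocol domain shown (Animation) is available in the running browser; enable_animation_events is a hypothetical helper name.

```python
async def enable_animation_events(page, on_created):
    """Open a CDP session and subscribe to Animation.animationCreated."""
    # Assumed: `page.target` returns the Target this page belongs to.
    client = await page.target.createCDPSession()
    await client.send('Animation.enable')  # call a protocol method
    client.on('Animation.animationCreated', on_created)  # subscribe to an event
    return client
```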
Coverage Class¶
- class pyppeteer.coverage.Coverage(client: CDPSession)[source]¶
Coverage class.
Coverage gathers information about parts of JavaScript and CSS that were used by the page.
An example of using JavaScript and CSS coverage to get percentage of initially executed code:
    # Enable both JavaScript and CSS coverage
    await page.coverage.startJSCoverage()
    await page.coverage.startCSSCoverage()
    # Navigate to page
    await page.goto('https://example.com')
    # Disable JS and CSS coverage and get results
    jsCoverage = await page.coverage.stopJSCoverage()
    cssCoverage = await page.coverage.stopCSSCoverage()
    totalBytes = 0
    usedBytes = 0
    coverage = jsCoverage + cssCoverage
    for entry in coverage:
        totalBytes += len(entry['text'])
        for rng in entry['ranges']:
            # end is exclusive, so the range length is end - start
            usedBytes += rng['end'] - rng['start']
    print('Bytes used: {}%'.format(usedBytes / totalBytes * 100))
- coroutine startCSSCoverage(options: Dict = None, **kwargs: Any) None [source]¶
Start CSS coverage measurement.
Available options are:
resetOnNavigation (bool): Whether to reset coverage on every navigation. Defaults to True.
- coroutine startJSCoverage(options: Dict = None, **kwargs: Any) None [source]¶
Start JS coverage measurement.
Available options are:
resetOnNavigation (bool): Whether to reset coverage on every navigation. Defaults to True.
reportAnonymousScript (bool): Whether anonymous script generated by the page should be reported. Defaults to False.
Note
Anonymous scripts are ones that don’t have an associated url. These are scripts that are dynamically created on the page using eval or new Function. If reportAnonymousScript is set to True, anonymous scripts will have __pyppeteer_evaluation_script__ as their url.
- coroutine stopCSSCoverage() List [source]¶
Stop CSS coverage measurement and get result.
Return list of coverage reports for all non-anonymous stylesheets. Each report includes:
url (str): StyleSheet url.
text (str): StyleSheet content.
ranges (List[Dict]): StyleSheet ranges that were executed. Ranges are sorted and non-overlapping.
    start (int): A start offset in text, inclusive.
    end (int): An end offset in text, exclusive.
Note
CSS coverage doesn’t include dynamically injected style tags without sourceURLs (but currently includes… to be fixed).
- coroutine stopJSCoverage() List [source]¶
Stop JS coverage measurement and get result.
Return list of coverage reports for all scripts. Each report includes:
url (str): Script url.
text (str): Script content.
ranges (List[Dict]): Script ranges that were executed. Ranges are sorted and non-overlapping.
    start (int): A start offset in text, inclusive.
    end (int): An end offset in text, exclusive.
Note
JavaScript coverage doesn’t include anonymous scripts by default. However, scripts with sourceURLs are reported.
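The report structure described above can be summarized per url. The helper below is a sketch (coverage_by_url is a hypothetical name) that works on the plain list of dictionaries returned by stopJSCoverage() or stopCSSCoverage(), treating end as exclusive as documented.

```python
def coverage_by_url(entries):
    """Map each report's url to the fraction of its text that was used."""
    result = {}
    for entry in entries:
        total = len(entry['text'])
        # `end` is exclusive, so each range covers end - start characters.
        used = sum(r['end'] - r['start'] for r in entry['ranges'])
        result[entry['url']] = used / total if total else 0.0
    return result
```

Reports with empty text map to 0.0 here to avoid dividing by zero; that choice is arbitrary.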
Debugging¶
For debugging, you can set logLevel option to logging.DEBUG for pyppeteer.launcher.launch() and pyppeteer.launcher.connect() functions. However, this option prints too many logs including SEND/RECV messages of pyppeteer. In order to only show suppressed error messages, you should set pyppeteer.DEBUG to True.
Example:
    import asyncio
    import pyppeteer
    from pyppeteer import launch

    pyppeteer.DEBUG = True  # print suppressed errors as error log

    async def main():
        browser = await launch()
        ...  # do something

    asyncio.get_event_loop().run_until_complete(main())