langchain.chains.natbot.crawler.Crawler

class langchain.chains.natbot.crawler.Crawler

A crawler for web pages.

Security Note: This is an implementation of a crawler that uses a browser via Playwright.

This crawler can be used to load arbitrary webpages INCLUDING content from the local file system.

Control who can submit crawling requests and what network access the crawler has.

Make sure to scope permissions to the minimal permissions necessary for the application.

See https://python.langchain.ac.cn/docs/security for more information.
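
One way to act on the security note is to validate every URL before it reaches go_to_page. The block below is a minimal sketch and not part of the Crawler API: ALLOWED_HOSTS, ALLOWED_SCHEMES, is_url_allowed and checked_go_to_page are hypothetical names introduced for illustration, and the host list is a placeholder. It uses only the standard library and accepts any object that exposes a go_to_page(url) method.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts your application needs.
ALLOWED_HOSTS = {"example.com", "docs.example.com"}
ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url: str) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        # Rejects file:// (local file system) and scheme-less input.
        return False
    # hostname is None when the URL has no network location.
    return parsed.hostname in ALLOWED_HOSTS


def checked_go_to_page(crawler, url: str) -> None:
    """Call crawler.go_to_page(url) only if the URL passes the allowlist."""
    if not is_url_allowed(url):
        raise ValueError(f"URL not permitted: {url!r}")
    crawler.go_to_page(url)
```

A check like this only covers the URL that the caller submits. Redirects and navigation triggered by the page itself are not covered, so network-level restrictions on the browser process are still needed.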

Methods

__init__()

click(id)

crawl()

enter()

go_to_page(url)

scroll(direction)

type(id, text)

__init__() → None
Return type

None

click(id: Union[str, int]) → None
Parameters

id (Union[str, int]) –

Return type

None

crawl() → List[str]
Return type

List[str]

enter() → None
Return type

None

go_to_page(url: str) → None
Parameters

url (str) –

Return type

None

scroll(direction: str) → None
Parameters

direction (str) –

Return type

None

type(id: Union[str, int], text: str) → None
Parameters
  • id (Union[str, int]) –

  • text (str) –

Return type

None
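
The methods above can be combined into a simple interaction sequence. The following is a sketch under stated assumptions rather than a documented workflow: CrawlerLike, submit_query and RecordingCrawler are hypothetical names, box_id is whatever element id the caller has chosen, and calling crawl() before type() is an assumption that the crawler needs a fresh crawl to know the element ids. RecordingCrawler is a stand-in so that the example runs without a browser.

```python
from typing import List, Protocol, Tuple, Union

ElementId = Union[str, int]


class CrawlerLike(Protocol):
    """Structural type mirroring the documented method signatures."""

    def go_to_page(self, url: str) -> None: ...
    def crawl(self) -> List[str]: ...
    def type(self, id: ElementId, text: str) -> None: ...
    def enter(self) -> None: ...


def submit_query(
    crawler: CrawlerLike, url: str, box_id: ElementId, query: str
) -> List[str]:
    """Open url, type query into element box_id, press Enter, re-crawl."""
    crawler.go_to_page(url)
    crawler.crawl()  # assumption: element ids come from the latest crawl
    crawler.type(box_id, query)
    crawler.enter()
    return crawler.crawl()


class RecordingCrawler:
    """Hypothetical stand-in that records calls instead of driving a browser."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []

    def go_to_page(self, url: str) -> None:
        self.calls.append(("go_to_page", url))

    def crawl(self) -> List[str]:
        self.calls.append(("crawl",))
        return ["placeholder element"]  # not the real output format

    def type(self, id: ElementId, text: str) -> None:
        self.calls.append(("type", id, text))

    def enter(self) -> None:
        self.calls.append(("enter",))


recorder = RecordingCrawler()
elements = submit_query(recorder, "https://example.com", 0, "langchain")
```

With Playwright installed, an instance of langchain.chains.natbot.crawler.Crawler could be passed in place of RecordingCrawler, since it exposes the same method names and signatures.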