
Bright Data

Service domain: Web scraping
Community, BYOC

Search, Crawl and Scrape any site, at scale, without getting blocked

Author: Arcade
Version: 0.5.0
Auth: No authentication required
3 tools
3 require secrets

Bright Data provides a developer toolkit for large-scale web search, crawling, and scraping, enabling reliable extraction of pages and structured data without getting blocked. It supports search queries, content-to-Markdown conversion, and configurable data feeds across many site types.

It is designed for integration into data pipelines and analytics workflows, with parameterized feeds and output formats.

Capabilities

  • Block-resistant crawling and scraping at scale, for sustained collection.
  • Flexible search engine queries with advanced parameters across major engines.
  • Transform pages into clean Markdown and emit structured JSON feeds for profiles, products, reviews, listings, and media.
  • Configurable extraction parameters for batching, pagination, and media handling.

Secrets

  • API key (BRIGHTDATA_API_KEY) and zone token (BRIGHTDATA_ZONE). Example values: BRIGHTDATA_API_KEY=sk_..., BRIGHTDATA_ZONE=zone123.
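
As a minimal sketch, a deployment could check which secrets a tool still needs before calling it. The per-tool secret lists below follow the Requirements entries on this page. Reading the values from a plain mapping such as `os.environ` and the helper name `missing_secrets` are assumptions for illustration. How secrets actually reach the toolkit depends on your Arcade setup.

```python
import os
from typing import Mapping

# Secrets each tool lists under "Requirements" on this page.
TOOL_SECRETS = {
    "Brightdata.ScrapeAsMarkdown": ("BRIGHTDATA_API_KEY", "BRIGHTDATA_ZONE"),
    "Brightdata.SearchEngine": ("BRIGHTDATA_API_KEY", "BRIGHTDATA_ZONE"),
    "Brightdata.WebDataFeed": ("BRIGHTDATA_API_KEY",),
}


def missing_secrets(tool_name: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required secrets that are unset or empty.

    Hypothetical helper: it only inspects the given mapping and does not
    contact Bright Data or Arcade.
    """
    return [name for name in TOOL_SECRETS[tool_name] if not env.get(name)]
```

An empty list means every secret the tool lists is present in the mapping.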

Available tools (3)

Tool name | Description | Secrets
Brightdata.ScrapeAsMarkdown | Scrape a webpage and return content in Markdown format using Bright Data. | 2
Brightdata.SearchEngine | Search using Google, Bing, or Yandex with advanced parameters using Bright Data. | 2
Brightdata.WebDataFeed | Extract structured data from various websites like LinkedIn, Amazon, Instagram, etc. | 1


Brightdata.ScrapeAsMarkdown

Execution hints

Signals for MCP clients and agents about how this tool behaves.

Operations: Read
Service domains: Web scraping

MCP behavior

  • Read only: Yes. The tool does not modify remote state.
  • Destructive: No. The tool does not delete or overwrite remote data.
  • Idempotent: Yes. Safe to retry without extra side effects.
  • Open world: Yes. The tool can call out to external systems.

Scrape a webpage and return content in Markdown format using Bright Data.

Examples:

    scrape_as_markdown("https://example.com") -> "# Example Page Content..."
    scrape_as_markdown("https://news.ycombinator.com") -> "# Hacker News ..."

Parameters

Parameter | Type | Req. | Description
url | string | Required | URL to scrape

Requirements

Secrets: BRIGHTDATA_API_KEY, BRIGHTDATA_ZONE

Output

Type: string. Scraped webpage content as Markdown.
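
The sketch below shows one way to prepare the single `url` input and to pull a title out of the returned Markdown. Both helper names are hypothetical. Requiring an absolute http(s) URL is an assumption rather than a documented rule of the tool, and the heading extraction assumes the page title arrives as a `# ` heading, as in the examples above.

```python
from typing import Optional
from urllib.parse import urlparse


def build_scrape_input(url: str) -> dict[str, str]:
    """Build the input mapping for Brightdata.ScrapeAsMarkdown.

    Assumption: only absolute http(s) URLs are accepted here.
    """
    cleaned = url.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not an absolute http(s) URL: {url!r}")
    return {"url": cleaned}


def first_heading(markdown: str) -> Optional[str]:
    """Return the text of the first level-1 Markdown heading, if any."""
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("# "):
            return stripped[2:].strip()
    return None
```

The mapping returned by `build_scrape_input` can then be passed as the tool input by whichever client executes the tool.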

Brightdata.SearchEngine

Execution hints

Signals for MCP clients and agents about how this tool behaves.

Operations: Read
Service domains: Web scraping

MCP behavior

  • Read only: Yes. The tool does not modify remote state.
  • Destructive: No. The tool does not delete or overwrite remote data.
  • Idempotent: Yes. Safe to retry without extra side effects.
  • Open world: Yes. The tool can call out to external systems.

Search using Google, Bing, or Yandex with advanced parameters using Bright Data.

Examples:

    search_engine("climate change") -> "# Search Results ## Climate Change - Wikipedia ..."
    search_engine("Python tutorials", engine="bing", num_results=5) -> "# Bing Results ..."
    search_engine("cats", search_type="images", country_code="us") -> "# Image Results ..."

Parameters

Parameter | Type | Req. | Description
query | string | Required | Search query
engine | string | Optional | Search engine to use. Options: google, bing, yandex
language | string | Optional | Two-letter language code
country_code | string | Optional | Two-letter country code
search_type | string | Optional | Type of search. Options: images, shopping, news, jobs
start | integer | Optional | Results pagination offset
num_results | integer | Optional | Number of results to return. The default is 10
location | string | Optional | Location for search results
device | string | Optional | Device type. Options: mobile, ios, iphone, ipad, android, android_tablet
return_json | boolean | Optional | Return JSON instead of Markdown

Requirements

Secrets: BRIGHTDATA_API_KEY, BRIGHTDATA_ZONE

Output

Type: string. Search results as Markdown or JSON.
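
The parameter table above can be turned into a small input builder that rejects unsupported option values before a call is made. This is a sketch only: the helper names `build_search_input` and `paged_inputs` are hypothetical, and treating `start` as a zero-based count of results to skip is an assumption that the table does not spell out.

```python
from typing import Any, Optional

# Allowed values copied from the parameter table.
ENGINES = {"google", "bing", "yandex"}
SEARCH_TYPES = {"images", "shopping", "news", "jobs"}
DEVICES = {"mobile", "ios", "iphone", "ipad", "android", "android_tablet"}


def build_search_input(
    query: str,
    *,
    engine: Optional[str] = None,
    search_type: Optional[str] = None,
    device: Optional[str] = None,
    start: Optional[int] = None,
    num_results: Optional[int] = None,
    return_json: Optional[bool] = None,
) -> dict[str, Any]:
    """Validate options and return only the parameters that were set."""
    if not query.strip():
        raise ValueError("query must not be empty")
    for name, value, allowed in (
        ("engine", engine, ENGINES),
        ("search_type", search_type, SEARCH_TYPES),
        ("device", device, DEVICES),
    ):
        if value is not None and value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    if start is not None and start < 0:
        raise ValueError("start must be non-negative")
    if num_results is not None and num_results < 1:
        raise ValueError("num_results must be at least 1")
    candidate = {
        "query": query,
        "engine": engine,
        "search_type": search_type,
        "device": device,
        "start": start,
        "num_results": num_results,
        "return_json": return_json,
    }
    # Unset options are omitted so the tool's own defaults apply.
    return {key: value for key, value in candidate.items() if value is not None}


def paged_inputs(query: str, pages: int, page_size: int = 10, **options: Any) -> list[dict[str, Any]]:
    """Build one input per page, assuming start is a zero-based offset."""
    return [
        build_search_input(query, start=page * page_size, num_results=page_size, **options)
        for page in range(pages)
    ]
```

For example, `paged_inputs("cats", pages=3, engine="bing")` yields three inputs whose `start` values are 0, 10, and 20.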

Brightdata.WebDataFeed

Execution hints

Signals for MCP clients and agents about how this tool behaves.

Operations: Read
Service domains: Web scraping

MCP behavior

  • Read only: Yes. The tool does not modify remote state.
  • Destructive: No. The tool does not delete or overwrite remote data.
  • Idempotent: No. Retrying is not guaranteed to be free of extra side effects.
  • Open world: Yes. The tool can call out to external systems.
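
A client could use these hints to decide whether a failed call may be retried automatically. The sketch below records the hint values listed on this page for the three tools. The `ExecutionHints` class, the `may_auto_retry` helper, and the policy itself (retry only idempotent, non-destructive tools) are assumptions for illustration, not behavior defined by Arcade or MCP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionHints:
    read_only: bool
    destructive: bool
    idempotent: bool
    open_world: bool


# Hint values as listed in each tool's "Execution hints" section.
HINTS = {
    "Brightdata.ScrapeAsMarkdown": ExecutionHints(True, False, True, True),
    "Brightdata.SearchEngine": ExecutionHints(True, False, True, True),
    "Brightdata.WebDataFeed": ExecutionHints(True, False, False, True),
}


def may_auto_retry(tool_name: str) -> bool:
    """Hypothetical policy: retry automatically only when the tool is
    marked idempotent and not destructive."""
    hints = HINTS[tool_name]
    return hints.idempotent and not hints.destructive
```

Under this policy, Brightdata.WebDataFeed calls would be surfaced to the caller on failure instead of being retried silently.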

Extract structured data from various websites like LinkedIn, Amazon, Instagram, etc. Never make up links: if links are needed, execute search_engine first.

Supported source types:

  • amazon_product, amazon_product_reviews
  • linkedin_person_profile, linkedin_company_profile
  • zoominfo_company_profile
  • instagram_profiles, instagram_posts, instagram_reels, instagram_comments
  • facebook_posts, facebook_marketplace_listings, facebook_company_reviews
  • x_posts
  • zillow_properties_listing
  • booking_hotel_listings
  • youtube_videos

Examples:

    web_data_feed("amazon_product", "https://amazon.com/dp/B08N5WRWNW") -> '{"title": "Product Name", ...}'
    web_data_feed("linkedin_person_profile", "https://linkedin.com/in/johndoe") -> '{"name": "John Doe", ...}'
    web_data_feed("facebook_company_reviews", "https://facebook.com/company", num_of_reviews=50) -> '[{"review": "...", ...}]'

Parameters

Parameter | Type | Req. | Description
source_type | string | Required | Type of data source. Options: amazon_product, amazon_product_reviews, linkedin_person_profile, linkedin_company_profile, zoominfo_company_profile, instagram_profiles, instagram_posts, instagram_reels, instagram_comments, facebook_posts, facebook_marketplace_listings, facebook_company_reviews, x_posts, zillow_properties_listing, booking_hotel_listings, youtube_videos
url | string | Required | URL of the web resource to extract data from
num_of_reviews | integer | Optional | Number of reviews to retrieve. Only applicable for facebook_company_reviews. Default is None
timeout | integer | Optional | Maximum time in seconds to wait for data retrieval
polling_interval | integer | Optional | Time in seconds between polling attempts
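
The constraints in this table (a fixed set of source types, and `num_of_reviews` applying only to `facebook_company_reviews`) can be checked before a call is made. The following is a sketch under those stated rules. The helper name `build_feed_input` is hypothetical, and rejecting `num_of_reviews` for other source types is a design choice of this sketch, since the page does not say how the tool itself reacts.

```python
from typing import Any, Optional

# Source types copied from the parameter table.
SOURCE_TYPES = (
    "amazon_product", "amazon_product_reviews",
    "linkedin_person_profile", "linkedin_company_profile",
    "zoominfo_company_profile",
    "instagram_profiles", "instagram_posts", "instagram_reels", "instagram_comments",
    "facebook_posts", "facebook_marketplace_listings", "facebook_company_reviews",
    "x_posts", "zillow_properties_listing", "booking_hotel_listings", "youtube_videos",
)


def build_feed_input(
    source_type: str,
    url: str,
    num_of_reviews: Optional[int] = None,
    timeout: Optional[int] = None,
    polling_interval: Optional[int] = None,
) -> dict[str, Any]:
    """Validate and assemble the input for Brightdata.WebDataFeed."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"Unsupported source_type: {source_type!r}")
    if num_of_reviews is not None and source_type != "facebook_company_reviews":
        raise ValueError("num_of_reviews only applies to facebook_company_reviews")
    candidate = {
        "source_type": source_type,
        "url": url,
        "num_of_reviews": num_of_reviews,
        "timeout": timeout,
        "polling_interval": polling_interval,
    }
    # Leave out unset options so the tool's defaults apply.
    return {key: value for key, value in candidate.items() if value is not None}
```

Failing early on an unsupported combination avoids spending a feed request on input the tool would not use.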

Requirements

Secrets: BRIGHTDATA_API_KEY

Output

Type: string. Structured data from the requested source as JSON.
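
The `timeout` and `polling_interval` parameters suggest that the tool waits for results by polling. The loop below is a generic illustration of that pattern, not the toolkit's actual implementation. The function `poll_until_ready` and its `check` callback are hypothetical, and the clock and sleep functions are injectable so the sketch runs without network access.

```python
import time
from typing import Callable, Optional


def poll_until_ready(
    check: Callable[[], Optional[str]],
    timeout: float,
    polling_interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check() every polling_interval seconds until it returns a result.

    check() returns None while the data is not ready. A TimeoutError is
    raised once another wait would pass the deadline.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if clock() + polling_interval > deadline:
            raise TimeoutError(f"No result within {timeout} seconds")
        sleep(polling_interval)
```

Because the output is a JSON string, a caller would typically pass the returned value to `json.loads` before using it.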