Firecrawl Website Scraper

The Firecrawl Website Scraper Flow sends a website URL to the Firecrawl API, retrieves the site's content in a specified format (e.g., Markdown), and stores the results for further analysis. It automates website scraping tasks so you can capture and structure website content efficiently, making it particularly useful for research automation, competitive analysis, and content aggregation.

You can find this template in the Services Catalog under these categories:

  • Contextual Basics

  • Enrichment

What's Included:

  • 1 Flow

  • 1 Object Type

  • 1 Connection

What You'll Need:

  • Access to the Firecrawl API

  • API Key for the Firecrawl service

Ideas for Using the Firecrawl Website Scraper Flow:

  • Research Automation: Use this flow to automate website scraping on specific topics such as market trends, competitor analysis, or product reviews.

  • Content Aggregation: Quickly extract and organize website content by sending various URLs to the Firecrawl API and capturing structured markdown data.

  • Data Enrichment: Implement this flow to enhance internal datasets with additional information scraped from relevant websites.


Flow Overview

Flow Start

The flow begins by injecting a test URL, which can be modified to suit specific scraping needs.

Send URL and Receive Response

The flow sends the URL to the Firecrawl API. The API processes the URL and returns a response, which is then logged and passed on for further formatting.

Format Response & Create Record

The response from the Firecrawl API is structured into a record format that includes the website title, description, and content. The scraped data is then stored in the system.

Error Handling

Any errors encountered during the flow are captured and logged for troubleshooting, ensuring that issues can be quickly identified and resolved.

Flow End

The flow concludes once the records have been successfully created or any errors have been logged.


Firecrawl Website Scraper Flow Details

Inbound Send to Agent Events

  • Nodes: contextual-start

  • Purpose: The flow begins by receiving a start signal, typically initiated by an external event or agent.

In-Editor Testing

  • Nodes: Test URL, Prepare Scrape

  • Purpose: Allows for testing the flow directly within the editor. The URL is prepared and passed to the Firecrawl API for processing, as illustrated below.

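The Test URL node is an Inject node that seeds the message with the URL to scrape. Its effect on the message is roughly equivalent to the following function (illustrative only; the actual test URL ships with the template and can be changed to any page you want to scrape):

// Illustrative only: the Test URL Inject node sets msg.payload.url,
// which the Prepare Scrape function below reads.
msg.payload = {
    url: "https://example.com" // replace with the page you want to scrape
};
return msg;
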
Code Example: Prepare Scrape Function

// Prepare the payload for Firecrawl API request
msg.payload = {
    url: msg.payload.url, // URL to be scraped, set in the Inject node above
    formats: ["markdown"] // Specify the output format for the scraped data
};
return msg;

Explanation: This function constructs the payload for the Firecrawl API, specifying the URL to scrape and the desired output format. The payload is then passed to the next node in the flow, where it will be sent to the API.

Send URL and Receive Response

  • Nodes: Send to Firecrawl, Firecrawl Response

  • Purpose: The prepared URL is sent to the Firecrawl API. The response is logged and passed on for further formatting and processing, as sketched below.

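Under the hood, the Send to Firecrawl node issues an HTTP POST carrying the payload prepared above. Outside the flow, the equivalent request would look roughly like this sketch, which assumes Firecrawl's hosted v1 scrape endpoint; in the flow itself the endpoint and API key come from the node's configuration and the bundled Connection rather than from code:

// Illustrative only: a rough stand-in for what the Send to Firecrawl node issues.
// The endpoint and header names assume Firecrawl's v1 scrape API; the flow
// supplies these through the HTTP node and its Connection.
async function scrapeWithFirecrawl(url) {
    const response = await fetch("https://api.firecrawl.dev/v1/scrape", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
            "Authorization": "Bearer <FIRECRAWL_API_KEY>" // held in the Connection
        },
        body: JSON.stringify({ url, formats: ["markdown"] }) // same shape as msg.payload
    });
    return response.json();
}
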
Code Example: Firecrawl Response Log

// Reshape the Firecrawl API response so it can be logged and stored
msg.payload.response = {
    title: msg.payload.response.data.metadata.title,
    description: msg.payload.response.data.metadata.description,
    content: msg.payload.response.data.markdown
};
return msg;

Explanation: This function reshapes the Firecrawl API response, pulling out the website title, description, and markdown content and reassigning them to msg.payload.response so the scraped data and its metadata can be logged and passed on for further processing and storage.
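
The property paths used above assume a response body shaped roughly like the following (illustrative and abbreviated; only the fields the flow reads are shown):

// Illustrative, abbreviated shape of msg.payload.response after the API call.
{
    "data": {
        "markdown": "# Page content converted to markdown...",
        "metadata": {
            "title": "Example Page Title",
            "description": "Example page description."
        }
    }
}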

Format Response & Create Record

  • Nodes: Prepare Record Data, Create Scraped Data Record, Create Scraped Data Record Log

  • Purpose: The scraped data is formatted into a structured record and stored in the system, including key metadata and the scraped website content.

Code Example: Prepare Record Data Function

// Prepare data for Create Object node and assign to msg.payload
let body = {
    title: msg.payload.response.title,
    description: msg.payload.response.description,
    content: msg.payload.response.content,
    url: msg.payload.url // Original URL that was scraped
};

// Assign prepared data to msg.payload for use in Create Object node
msg.payload = body;
return msg;

Explanation: This function formats the API response into a structured object, which includes the website title, description, content, and original URL. This data is then ready to be stored as a record.
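
The bundled Object Type that stores these records is not shown here, but based on the fields above, its data schema needs to cover four string properties. An illustrative (not the actual) schema might look like this:

{
    "type": "object",
    "properties": {
        "title": { "type": "string" },
        "description": { "type": "string" },
        "content": { "type": "string" },
        "url": { "type": "string" }
    }
}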

Error Handling

  • Nodes: catch, Error Catch Log, contextual-error

  • Purpose: Catches any errors that occur during the flow and logs them for review, ensuring that issues can be identified and resolved (see the sketch below).

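The error-handling nodes ship preconfigured in the template, but for reference, an Error Catch Log function can be as simple as the sketch below, which assumes the catch node attaches the error details (message text and source node) to msg.error:

// Illustrative only: shape the caught error for logging, assuming the catch
// node places error details on msg.error.
msg.payload = {
    error: msg.error && msg.error.message,
    source: msg.error && msg.error.source && msg.error.source.name
};
return msg; // passed on to contextual-error after logging
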
Flow End

  • Nodes: contextual-end

  • Purpose: The flow completes its process, either after successfully creating records or after logging any errors that occurred.


Summary of Flow:

  • Flow Start: Initiate the flow with a test URL.

  • Data Preparation: Prepare the URL for interaction with the Firecrawl API.

  • API Interaction: Send the URL to the Firecrawl API and log the response.

  • Record Creation: Format and store the scraped website content as a record for analysis.

  • Error Handling: Capture and log any errors that occur during the process.

  • Flow End: Conclude the flow after records are created or errors are logged.
