Gather data from external sources such as websites, PDFs, ZIP archives, and APIs.
Use this module to retrieve raw data and documents so they can be processed and analyzed downstream.
ExtractTextFromPDF — Extract and clean text from PDF documents
FetchPDFFromURL — Download PDF files from URLs
FetchUSShapefile — Retrieve geographical shapefiles from the U.S. Census Bureau TIGER database
FetchWebsiteText — Scrape text content from websites
GetCompanyFilings — Access SEC EDGAR company filings
GetGoogleSearchResults — Fetch Google search results via Serper API
GetZipFile — Download and extract ZIP files from URLs
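To make the kind of work these components do more concrete, here is a minimal sketch of two of the underlying techniques, downloading and unpacking a ZIP archive and pulling visible text out of an HTML page, using only the Python standard library. This is an illustration under stated assumptions, not the module's actual implementation: the helper names `fetch_bytes`, `extract_zip_bytes`, and `html_to_text` are hypothetical and do not correspond to the API of `GetZipFile` or `FetchWebsiteText`.

```python
import io
import zipfile
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download the raw body of a URL (hypothetical helper, not the module's API)."""
    request = Request(url, headers={"User-Agent": "gather-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read()


def extract_zip_bytes(data: bytes) -> dict[str, bytes]:
    """Unpack an in-memory ZIP archive into a {member name: contents} mapping."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {
            info.filename: archive.read(info)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries, keep files only
        }


class _TextCollector(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as one space-joined string."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

With these helpers, `extract_zip_bytes(fetch_bytes(url))` would return the files of a remote archive, and `html_to_text(fetch_bytes(url).decode("utf-8", errors="replace"))` would return the text of a page. A production scraper would also need to handle redirects, rate limits, character encodings, and pages rendered by JavaScript, which this sketch leaves out.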
Common use cases:
Web scraping
Document processing
Geospatial analysis
Market research and competitive intelligence