# Data Collection

### Purpose

Use this module to retrieve raw data and documents so they can be processed and analyzed downstream.

### Key functions

* `ExtractTextFromPDF` — Extract and clean text from PDF documents
* `FetchPDFFromURL` — Download PDF files from URLs
* `FetchUSShapefile` — Retrieve geographic shapefiles from the U.S. Census Bureau's TIGER/Line database
* `FetchWebsiteText` — Scrape text content from websites
* `GetCompanyFilings` — Access SEC EDGAR company filings
* `GetGoogleSearchResults` — Fetch Google search results via the Serper API
* `GetZipFile` — Download and extract ZIP files from URLs
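
To make the download-and-extract pattern behind a function like `GetZipFile` concrete, here is a minimal sketch that uses only the Python standard library. It is not this module's implementation or API: the names `extract_zip_bytes` and `download_and_extract`, their parameters, and the 30-second timeout are all hypothetical and chosen for illustration only.

```python
import io
import urllib.request
import zipfile
from pathlib import Path


def extract_zip_bytes(data: bytes, dest_dir: str) -> list[str]:
    """Extract an in-memory ZIP archive into dest_dir (hypothetical helper).

    Returns the member names stored in the archive.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        for name in names:
            # Defensive check: refuse members that would resolve outside dest.
            target = (dest / name).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {name}")
        archive.extractall(dest)
    return names


def download_and_extract(url: str, dest_dir: str, timeout: float = 30.0) -> list[str]:
    """Download a ZIP file from url and extract it (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    return extract_zip_bytes(data, dest_dir)
```

Keeping the network call separate from the extraction step means the extraction logic can be exercised with in-memory bytes, without any network access. The sketch assumes Python 3.9 or later, because it relies on `Path.is_relative_to`.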

### Common use cases

* Web scraping
* Document processing
* Geospatial analysis
* Market research and competitive intelligence
