HG Data Query
Execute read-only SQL SELECT queries against the HG Insights data warehouse with billions of rows of technographic, intent, and spend data.
Data Coverage
| Metric | Value | Source |
|---|---|---|
| Warehouse rows | Billions | Phoenix MCP tools overview: "data warehouse with billions of rows of technographic, intent, and spend data". Verified 2026-05-11. |
Run read-only SQL against the HG Insights data warehouse.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
query | string | Yes | - | Read-only SQL query to execute. Must start with SELECT or WITH. |
max_rows | number | No | 1000 | Maximum number of rows to return (1-10,000). |
Required Integrations
hginsights_v2
Credit Cost
Credits are dynamic — 1 credit per row returned by the query. The actual cost is reported by the upstream HG Insights API in the credits_consumed field of each response.
- A query returning 100 rows costs 100 credits
- A query returning 0 rows costs 0 credits
hg_catalogis free (0 credits)
Available Tables
Use hg_catalog to discover the full schema. Key tables include:
| Table | Approximate Rows | Description |
|---|---|---|
company_locations | ~30M | Company lookup and firmographics |
install_global | ~296M | Current product installations |
install_time_series_global | ~27B | Historical verification events |
install_intensity_by_month_global | ~3.9B | Monthly intensity scores |
install_intensity_momentum_global | ~102M | Usage acceleration |
install_intensity_trending_global | ~81M | Usage trending |
intent_global | ~7.3B | Buyer intent signals |
spend_absolute | - | IT spend data |
Most tables can be joined to company_locations via the url_id column.
Security Guardrails
- SELECT only: Queries must start with
SELECTorWITH. DDL/DML keywords (INSERT,UPDATE,DELETE,DROP,CREATE,ALTER, etc.) are blocked. - Single statement: Semicolons are rejected to prevent multi-statement payloads.
- Timeout: Queries that do not complete within 30 seconds are aborted.
- Row cap: Maximum 10,000 rows per query (enforced via
max_rowsparameter). - The upstream API may impose additional restrictions beyond client-side validation.
Use Cases
- Custom analytics across the full HG data warehouse
- Cross-table joins not available through individual MCP tools
- Aggregation queries (COUNT, SUM, AVG) across large datasets
- Time-series analysis of technology adoption trends
- Building custom company lists with complex filtering criteria
- Total Addressable Market (TAM) / market sizing — see below
TAM / Market-Sizing
hg_data_query is the way to answer TAM-shaped questions about a specific product or vendor. The customer-data query layer already supports the underlying capability; the flow is anchored on the customer's own product/vendor (TAM is "who could buy my product?") and routes through get_vendor_information to resolve product IDs and their categoryIds in a single call.
There are three TAM shapes:
installed_base— companies that already use the anchor product. "How many mid-market NA companies installed Splunk?"competitive_displacement— companies on a different vendor in the same category as the anchor. "Companies running a SIEM that's not Splunk."whitespace— companies in the category with no vendor at all. "Mid-market companies with no SIEM yet."
The flow:
- Resolve the anchor via
get_vendor_information(vendorName: "[product]"). Capture the product IDs and the distinctcategoryIdvalues across those products. The categoryIds define the implied competitive set, so there is no need to enumerate every competitor's product manually. - Discover the warehouse tables via
hg_catalog(free, 0 credits). Identify the install table (install_global), the firmographic table (company_locations), and the join key (url_id). - Compose the SQL using one of the three patterns below.
- Execute via
hg_data_queryand present the result.
Pattern A — installed_base
SELECT COUNT(DISTINCT cl.company_id) AS company_count
FROM install_global ig
JOIN company_locations cl USING (url_id)
WHERE ig.product_id IN (:anchorProductIds)
AND cl.country IN ('US', 'CA') AND cl.employees BETWEEN 100 AND 1000;
Pattern B — competitive_displacement
The category_id filter avoids any need for a category-filtered vendor lookup.
SELECT DISTINCT cl.company_id, cl.company_name, cl.country
FROM install_global ig
JOIN company_locations cl USING (url_id)
WHERE ig.category_id IN (:anchorCategoryIds)
AND ig.product_id NOT IN (:anchorProductIds)
AND cl.country IN ('US', 'CA')
LIMIT 1000;
Pattern C — whitespace
SELECT cl.company_id, cl.company_name
FROM company_locations cl
WHERE cl.country IN ('US', 'CA') AND cl.employees BETWEEN 100 AND 1000
AND cl.url_id NOT IN (
SELECT url_id FROM install_global WHERE category_id IN (:anchorCategoryIds)
)
LIMIT 1000;
Example Usage
Basic company lookup
{
"tool": "hg_data_query",
"parameters": {
"query": "SELECT url_id, company_name, country, employee_count FROM company_locations WHERE company_name ILIKE '%Salesforce%' LIMIT 10"
}
}
Find top technologies at a company
{
"tool": "hg_data_query",
"parameters": {
"query": "SELECT ig.product_name, ig.vendor_name, ig.intensity FROM install_global ig JOIN company_locations cl ON ig.url_id = cl.url_id WHERE cl.company_name ILIKE '%Cisco%' ORDER BY ig.intensity DESC LIMIT 20"
}
}
Aggregate intent signals by topic
{
"tool": "hg_data_query",
"parameters": {
"query": "SELECT topic, COUNT(*) as signal_count FROM intent_global WHERE url_id IN (SELECT url_id FROM company_locations WHERE country = 'US' AND employee_count > 1000) GROUP BY topic ORDER BY signal_count DESC LIMIT 10",
"max_rows": 10
}
}
Monthly intensity trend
{
"tool": "hg_data_query",
"parameters": {
"query": "SELECT month, avg_intensity FROM install_intensity_by_month_global WHERE url_id = 12345 AND product_name = 'Snowflake' ORDER BY month DESC LIMIT 12"
}
}
Response Format
{
"rows": [
{ "url_id": 12345, "company_name": "Acme Corp", "country": "US" }
],
"columns": ["url_id", "company_name", "country"],
"row_count": 1,
"credits_consumed": 1
}
rows: Array of row objects with column names as keyscolumns: Array of column names returned by the queryrow_count: Number of rows returnedcredits_consumed: Actual credits consumed (dynamic, reported by upstream API)
Error Handling
Common errors:
- Invalid SQL: Query rejected if it contains disallowed keywords or doesn't start with SELECT/WITH
- Query rejected by warehouse: The upstream API returns details about SQL syntax errors or invalid table/column references
- Timeout: Query did not complete within 30 seconds
- Authentication: HG Insights API key not configured or invalid
- Rate limit: API rate limit exceeded
Best Practices
- Always call
hg_catalogfirst to discover available tables and columns before writing queries - Use LIMIT to control row count and credit consumption
- Join via
url_id— this is the primary key connecting most tables tocompany_locations - Start small — test queries with
LIMIT 10before running larger queries - Use specific columns instead of selecting all columns to improve query performance
Related Tools
hg_catalog— Discover available tables and columns (free, call before writing queries)company_technographic— Pre-built technology lookup for a single companycompany_firmographic— Company firmographic datacompany_intent— Intent signals for a single company