HG Data Query

Execute read-only SQL SELECT queries against the HG Insights data warehouse with billions of rows of technographic, intent, and spend data.

Data Coverage

Metric	Value	Source
Warehouse rows	Billions	Phoenix MCP tools overview: "data warehouse with billions of rows of technographic, intent, and spend data". Verified 2026-05-11.

Run read-only SQL against the HG Insights data warehouse.

Parameters

Name	Type	Required	Default	Description
`query`	string	Yes	-	Read-only SQL query to execute. Must start with `SELECT` or `WITH`.
`max_rows`	number	No	`1000`	Maximum number of rows to return (1-10,000).

Required Integrations

hginsights_v2

Credit Cost

Credits are dynamic — 1 credit per row returned by the query. The actual cost is reported by the upstream HG Insights API in the credits_consumed field of each response.

A query returning 100 rows costs 100 credits
A query returning 0 rows costs 0 credits
hg_catalog is free (0 credits)

Available Tables

Use hg_catalog to discover the full schema. Key tables include:

Table	Approximate Rows	Description
`company_locations`	~30M	Company lookup and firmographics
`install_global`	~296M	Current product installations
`install_time_series_global`	~27B	Historical verification events
`install_intensity_by_month_global`	~3.9B	Monthly intensity scores
`install_intensity_momentum_global`	~102M	Usage acceleration
`install_intensity_trending_global`	~81M	Usage trending
`intent_global`	~7.3B	Buyer intent signals
`spend_absolute`	-	IT spend data

Most tables can be joined to company_locations via the url_id column.

Security Guardrails

SELECT only: Queries must start with SELECT or WITH. DDL/DML keywords (INSERT, UPDATE, DELETE, DROP, CREATE, ALTER, etc.) are blocked.
Single statement: Semicolons are rejected to prevent multi-statement payloads.
Timeout: Queries that do not complete within 30 seconds are aborted.
Row cap: Maximum 10,000 rows per query (enforced via max_rows parameter).
The upstream API may impose additional restrictions beyond client-side validation.

Use Cases

Custom analytics across the full HG data warehouse
Cross-table joins not available through individual MCP tools
Aggregation queries (COUNT, SUM, AVG) across large datasets
Time-series analysis of technology adoption trends
Building custom company lists with complex filtering criteria
Total Addressable Market (TAM) / market sizing — see below

TAM / Market-Sizing

hg_data_query is the way to answer TAM-shaped questions about a specific product or vendor. The customer-data query layer already supports the underlying capability; the flow is anchored on the customer's own product/vendor (TAM is "who could buy my product?") and routes through get_vendor_information to resolve product IDs and their categoryIds in a single call.

There are three TAM shapes:

installed_base — companies that already use the anchor product. "How many mid-market NA companies installed Splunk?"
competitive_displacement — companies on a different vendor in the same category as the anchor. "Companies running a SIEM that's not Splunk."
whitespace — companies in the category with no vendor at all. "Mid-market companies with no SIEM yet."

The flow:

Resolve the anchor via get_vendor_information(vendorName: "[product]"). Capture the product IDs and the distinct categoryId values across those products. The categoryIds define the implied competitive set, so there is no need to enumerate every competitor's product manually.
Discover the warehouse tables via hg_catalog (free, 0 credits). Identify the install table (install_global), the firmographic table (company_locations), and the join key (url_id).
Compose the SQL using one of the three patterns below.
Execute via hg_data_query and present the result.

Pattern A — installed_base

SELECT COUNT(DISTINCT cl.company_id) AS company_count
FROM install_global ig
JOIN company_locations cl USING (url_id)
WHERE ig.product_id IN (:anchorProductIds)
  AND cl.country IN ('US', 'CA') AND cl.employees BETWEEN 100 AND 1000;

Pattern B — competitive_displacement

The category_id filter avoids any need for a category-filtered vendor lookup.

SELECT DISTINCT cl.company_id, cl.company_name, cl.country
FROM install_global ig
JOIN company_locations cl USING (url_id)
WHERE ig.category_id IN (:anchorCategoryIds)
  AND ig.product_id NOT IN (:anchorProductIds)
  AND cl.country IN ('US', 'CA')
LIMIT 1000;

Pattern C — whitespace

SELECT cl.company_id, cl.company_name
FROM company_locations cl
WHERE cl.country IN ('US', 'CA') AND cl.employees BETWEEN 100 AND 1000
  AND cl.url_id NOT IN (
    SELECT url_id FROM install_global WHERE category_id IN (:anchorCategoryIds)
  )
LIMIT 1000;

Example Usage

Basic company lookup

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT url_id, company_name, country, employee_count FROM company_locations WHERE company_name ILIKE '%Salesforce%' LIMIT 10"
  }
}

Find top technologies at a company

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT ig.product_name, ig.vendor_name, ig.intensity FROM install_global ig JOIN company_locations cl ON ig.url_id = cl.url_id WHERE cl.company_name ILIKE '%Cisco%' ORDER BY ig.intensity DESC LIMIT 20"
  }
}

Aggregate intent signals by topic

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT topic, COUNT(*) as signal_count FROM intent_global WHERE url_id IN (SELECT url_id FROM company_locations WHERE country = 'US' AND employee_count > 1000) GROUP BY topic ORDER BY signal_count DESC LIMIT 10",
    "max_rows": 10
  }
}

Monthly intensity trend

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT month, avg_intensity FROM install_intensity_by_month_global WHERE url_id = 12345 AND product_name = 'Snowflake' ORDER BY month DESC LIMIT 12"
  }
}

Response Format

{
  "rows": [
    { "url_id": 12345, "company_name": "Acme Corp", "country": "US" }
  ],
  "columns": ["url_id", "company_name", "country"],
  "row_count": 1,
  "credits_consumed": 1
}

rows: Array of row objects with column names as keys
columns: Array of column names returned by the query
row_count: Number of rows returned
credits_consumed: Actual credits consumed (dynamic, reported by upstream API)

Error Handling

Common errors:

Invalid SQL: Query rejected if it contains disallowed keywords or doesn't start with SELECT/WITH
Query rejected by warehouse: The upstream API returns details about SQL syntax errors or invalid table/column references
Timeout: Query did not complete within 30 seconds
Authentication: HG Insights API key not configured or invalid
Rate limit: API rate limit exceeded

Best Practices

Always call hg_catalog first to discover available tables and columns before writing queries
Use LIMIT to control row count and credit consumption
Join via url_id — this is the primary key connecting most tables to company_locations
Start small — test queries with LIMIT 10 before running larger queries
Use specific columns instead of selecting all columns to improve query performance

hg_catalog — Discover available tables and columns (free, call before writing queries)
company_technographic — Pre-built technology lookup for a single company
company_firmographic — Company firmographic data
company_intent — Intent signals for a single company

Data Coverage​

Parameters​

Required Integrations​

Credit Cost​

Available Tables​

Security Guardrails​

Use Cases​

TAM / Market-Sizing​

Pattern A — installed_base​

Pattern B — competitive_displacement​

Pattern C — whitespace​

Example Usage​

Basic company lookup​

Find top technologies at a company​

Aggregate intent signals by topic​

Monthly intensity trend​

Response Format​

Error Handling​

Best Practices​

Related Tools​

Next Steps​