HG Data Query

Execute read-only SQL SELECT queries against the HG Insights data warehouse with billions of rows of technographic, intent, and spend data.

Parameters

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | - | Read-only SQL query to execute. Must start with SELECT or WITH. |
| max_rows | number | No | 1000 | Maximum number of rows to return (1-10,000). |

Required Integrations

  • hginsights_v2

Credit Cost

Credits are dynamic: 1 credit per row returned by the query. The actual cost is reported by the upstream HG Insights API in the credits_consumed field of each response.

  • A query returning 100 rows costs 100 credits
  • A query returning 0 rows costs 0 credits
  • hg_catalog is free (0 credits)
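Because cost scales 1:1 with rows returned, the max_rows parameter bounds the worst-case credit cost of any single call. A minimal sketch of that relationship (the helper name is hypothetical, not part of the tool's API):

```python
def max_credit_cost(max_rows: int = 1000) -> int:
    """Worst-case credits for one hg_data_query call.

    At 1 credit per returned row, a call can never cost more than
    its max_rows cap (default 1000, hard limit 10,000).
    """
    if not 1 <= max_rows <= 10_000:
        raise ValueError("max_rows must be between 1 and 10,000")
    return max_rows  # 1 credit/row, so max_rows rows = max_rows credits
```

A query that returns fewer rows costs proportionally less; only the ceiling is fixed by max_rows.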

Available Tables

Use hg_catalog to discover the full schema. Key tables include:

| Table | Approximate Rows | Description |
|---|---|---|
| company_locations | ~30M | Company lookup and firmographics |
| install_global | ~296M | Current product installations |
| install_time_series_global | ~27B | Historical verification events |
| install_intensity_by_month_global | ~3.9B | Monthly intensity scores |
| install_intensity_momentum_global | ~102M | Usage acceleration |
| install_intensity_trending_global | ~81M | Usage trending |
| intent_global | ~7.3B | Buyer intent signals |
| spend_absolute | - | IT spend data |

Most tables can be joined to company_locations via the url_id column.

Security Guardrails

  • SELECT only: Queries must start with SELECT or WITH. DDL/DML keywords (INSERT, UPDATE, DELETE, DROP, CREATE, ALTER, etc.) are blocked.
  • Single statement: Semicolons are rejected to prevent multi-statement payloads.
  • Timeout: Queries that do not complete within 30 seconds are aborted.
  • Row cap: Maximum 10,000 rows per query (enforced via max_rows parameter).
  • The upstream API may impose additional restrictions beyond client-side validation.
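The first three guardrails can be mirrored client-side before a query is ever sent. The sketch below is a simplistic approximation (the function name is hypothetical, and a keyword scan like this will also flag words inside string literals; the tool's actual validation may differ):

```python
import re

# DDL/DML keywords the tool documents as blocked
BLOCKED_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER"}

def validate_query(sql: str) -> None:
    """Pre-flight check mirroring the documented guardrails.

    Raises ValueError if the query would be rejected:
    not starting with SELECT/WITH, containing a semicolon,
    or containing a blocked DDL/DML keyword.
    """
    stripped = sql.strip()
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    if first_word not in ("SELECT", "WITH"):
        raise ValueError("Query must start with SELECT or WITH")
    if ";" in stripped:
        raise ValueError("Semicolons are rejected (single statement only)")
    tokens = set(re.findall(r"[A-Za-z_]+", stripped.upper()))
    blocked = tokens & BLOCKED_KEYWORDS
    if blocked:
        raise ValueError(f"Blocked keyword(s): {sorted(blocked)}")
```

Note that the 30-second timeout and the 10,000-row cap can only be enforced server-side.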

Use Cases

  • Custom analytics across the full HG data warehouse
  • Cross-table joins not available through individual MCP tools
  • Aggregation queries (COUNT, SUM, AVG) across large datasets
  • Time-series analysis of technology adoption trends
  • Building custom company lists with complex filtering criteria

Example Usage

Basic company lookup

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT url_id, company_name, country, employee_count FROM company_locations WHERE company_name ILIKE '%Salesforce%' LIMIT 10"
  }
}

Find top technologies at a company

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT ig.product_name, ig.vendor_name, ig.intensity FROM install_global ig JOIN company_locations cl ON ig.url_id = cl.url_id WHERE cl.company_name ILIKE '%Cisco%' ORDER BY ig.intensity DESC LIMIT 20"
  }
}

Aggregate intent signals by topic

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT topic, COUNT(*) as signal_count FROM intent_global WHERE url_id IN (SELECT url_id FROM company_locations WHERE country = 'US' AND employee_count > 1000) GROUP BY topic ORDER BY signal_count DESC LIMIT 10",
    "max_rows": 10
  }
}

Monthly intensity trend

{
  "tool": "hg_data_query",
  "parameters": {
    "query": "SELECT month, avg_intensity FROM install_intensity_by_month_global WHERE url_id = 12345 AND product_name = 'Snowflake' ORDER BY month DESC LIMIT 12"
  }
}

Response Format

{
  "rows": [
    { "url_id": 12345, "company_name": "Acme Corp", "country": "US" }
  ],
  "columns": ["url_id", "company_name", "country"],
  "row_count": 1,
  "credits_consumed": 1
}
  • rows: Array of row objects with column names as keys
  • columns: Array of column names returned by the query
  • row_count: Number of rows returned
  • credits_consumed: Actual credits consumed (dynamic, reported by upstream API)
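A caller can sanity-check these fields against each other (row_count should equal the length of rows) and render the result for inspection. A minimal sketch, assuming the response has already been parsed into a dict (the function name is hypothetical):

```python
def summarize_response(resp: dict) -> str:
    """Sanity-check an hg_data_query response and render it as text.

    Verifies row_count matches the rows array, then prints one line
    per row in column order, plus a cost footer from credits_consumed.
    """
    assert resp["row_count"] == len(resp["rows"]), "row_count mismatch"
    header = " | ".join(resp["columns"])
    body = [" | ".join(str(row.get(col, "")) for col in resp["columns"])
            for row in resp["rows"]]
    footer = f"({resp['row_count']} rows, {resp['credits_consumed']} credits)"
    return "\n".join([header, *body, footer])
```

Iterating over the columns array (rather than the row dict's own key order) keeps the output aligned with what the query actually selected.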

Error Handling

Common errors:

  • Invalid SQL: Query rejected if it contains disallowed keywords or doesn't start with SELECT/WITH
  • Query rejected by warehouse: The upstream API returns details about SQL syntax errors or invalid table/column references
  • Timeout: Query did not complete within 30 seconds
  • Authentication: HG Insights API key not configured or invalid
  • Rate limit: API rate limit exceeded

Best Practices

  1. Always call hg_catalog first to discover available tables and columns before writing queries
  2. Use LIMIT to control row count and credit consumption
  3. Join via url_id — this is the primary key connecting most tables to company_locations
  4. Start small — test queries with LIMIT 10 before running larger queries
  5. Use specific columns instead of selecting all columns to improve query performance
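Practices 2 and 4 (bound row count, start small) can be enforced mechanically by appending a LIMIT to any query that lacks one. A minimal sketch, with a hypothetical helper name and a deliberately naive regex (it would also match "LIMIT" inside a string literal):

```python
import re

def ensure_limit(sql: str, cap: int = 10) -> str:
    """Append a LIMIT clause if the query does not already have one.

    Defaults to a small cap so test runs stay cheap (1 credit per row);
    raise the cap only once the query shape is confirmed.
    """
    if re.search(r"\bLIMIT\s+\d+\b", sql, flags=re.IGNORECASE):
        return sql  # caller already bounded the result set
    return f"{sql.rstrip()} LIMIT {cap}"
```

Combined with the max_rows parameter, this gives two independent bounds on how many rows, and therefore how many credits, a query can consume.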

Next Steps