Chatbotscape

Data Sources

This page catalogs every external data source Chatbotscape uses across its review, scoring, and ratings infrastructure. For each source, we document what it provides, how it's queried, what the refresh cadence is, what cost or rate-limit constraints apply, and what failure modes we've observed. Researchers, journalists, and methodology auditors can use this page to triangulate any Chatbotscape claim against its underlying source.

Primary sources for claim verification

Vendor public pages

The canonical source for any vendor's pricing, features, channels, integrations, certifications, and partnership claims. Every Tier 1 review verifies these claims directly against the vendor's current public pages within 30 days of publication for pricing and 6 months for functional claims.

| Vendor page type | Used for | Refresh cadence | Capture method |
| --- | --- | --- | --- |
| Pricing page | Pricing tiers, contact/seat limits, overage rates, free trial duration, currency variations | 90 days Tier 1 | Direct browser visit; monthly-billed toggle activated; URL parameters tested for multi-region |
| Product/features pages | AI capabilities, NLU language support, RAG/multimodal, BYOLLM claims | 6 months Tier 1 | Direct browser visit; feature claims captured against published vendor documentation |
| Integrations page | Native integration list, partner-platform names, third-party connector availability | 6 months Tier 1 | Direct browser visit; each named integration verified by name on vendor's integration directory |
| Channels page | Channel availability per tier, beta vs production status, BSP routing | 6 months Tier 1 | Direct browser visit; per-channel page checked for beta indicators |
| Help center / docs | Feature operational details, configuration depth, language coverage of documentation | 6 months Tier 1 | Direct browser visit; language switcher checked for help center localization scope |
| Trust / compliance / security pages | HIPAA, SOC 2, GDPR, LGPD, PCI-DSS certification claims | 6 months Tier 1 | Direct browser visit; certification claims verified on vendor's trust page or compliance documentation |
| About page | Founders, founding year, headquarters, team size | 12 months Tier 1 | Direct browser visit; cross-checked against LinkedIn vendor profile and press releases |

Failure modes we've observed and guard against:

  • Vendors A/B testing pricing pages with different cohort-specific pricing → we capture from a clean session without prior cookies
  • Vendors showing default-region currency that differs from the buyer's actual purchase currency → we capture native USD where the vendor supports it, and flag conversion math otherwise
  • Vendors burying tier limits inside expandable accordions or tooltip overlays → we expand all UI elements before capture
  • Vendors using "contact for pricing" framing on lower-tier plans for SMB segments → we capture this as a methodology note, not as missing data

Third-party aggregator pages (G2, Capterra, TrustPilot)

The canonical source for user-voice claims — review counts, star ratings, sub-rating breakdowns, recurring strengths and weaknesses patterns. Used in the User-Feedback-Patterns section of every Tier 1 review.

| Aggregator | URL pattern | Refresh cadence | What we extract |
| --- | --- | --- | --- |
| G2 | g2.com/products//reviews | 6 months Tier 1 | Star rating (overall), review count, sub-rating breakdown (Ease of Use, Features, Customer Service, Likelihood to Recommend), recurring strengths/weaknesses themes |
| Capterra | capterra.com/p/// | 6 months Tier 1 | Star rating, review count, sub-ratings (Ease of Use, Features, Value for Money, Customer Service), Capterra's vendor-rating breakdown |
| TrustPilot | trustpilot.com/review/ | 6 months Tier 1 | Star rating, review count, sentiment trend, complaint categories |
| Reddit (r/) | reddit.com/r/ | Spot check at refresh | Recent community discussion themes; supplementary signal only |
| Product Hunt | producthunt.com/products/ | 12 months Tier 1 | Upvote count, launch context (signal of vendor stage) |
| Reclame Aqui (Brazil-focused) | reclameaqui.com.br/ | 6 months for LATAM-priority reviews | Brazilian consumer complaint patterns; LATAM-priority signal |

Failure modes we've observed:

  • G2 review counts recalled from training data inflated roughly 48× over actual (claimed 7,800 vs actual 163) → we always verify against the current G2 page
  • TrustPilot ratings often diverge significantly from G2/Capterra (G2 4.6 vs TrustPilot 2.5) → we publish both rather than assuming parity
  • Capterra does not publicly show every sub-rating dimension for every platform → we publish what's visible and flag what's missing
  • Selective excerpting of aggregator reviews to manufacture sentiment → we report dominant signal across at least 6 months, not most flattering or damning excerpts
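
The first two checks above reduce to simple arithmetic. The following is a minimal sketch, not Chatbotscape's actual tooling: the function names and the 1.0-star divergence threshold are assumptions chosen for illustration, while the input numbers are the ones quoted in the list.

```python
def inflation_factor(claimed: int, observed: int) -> float:
    """How many times larger a claimed review count is than the count seen on the live page."""
    if observed <= 0:
        raise ValueError("observed count must be positive")
    return claimed / observed


def ratings_diverge(ratings: dict[str, float], max_gap: float = 1.0) -> bool:
    """True when the spread between aggregators exceeds max_gap stars (assumed threshold)."""
    return max(ratings.values()) - min(ratings.values()) > max_gap


# Figures quoted in the failure-mode list above.
print(round(inflation_factor(7800, 163), 1))
print(ratings_diverge({"G2": 4.6, "TrustPilot": 2.5}))
```

When the divergence check fires, the policy described above is to publish both ratings side by side rather than average them.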

Ahrefs API

The canonical source for multi-locale brand search volume — the primary input to platform popularity scoring (dimension 15 in the rubric).

| Endpoint | Used for | Cost | Refresh cadence |
| --- | --- | --- | --- |
| keywords-explorer-volume-by-country | Per-country brand search volume across 10 target locales | API units (Standard plan: 50k units/month, ~10% usage) | 90 days Tier 1, 30-day cache during active review writing |
| keywords-explorer-keyword-difficulty | KD signal for category and best-of list keyword targeting | API units | Pre-drafting for new pages; cached 30 days |
| serp-overview | SERP feature presence | API units | Spot check pre-drafting |

Target locales: United States, Brazil, Mexico, Spain, Argentina, Colombia, India, United Kingdom, Germany, France. Multi-locale aggregation matters because US-only ranking severely distorts LATAM priority: Manychat's brand search volume in LATAM is 3× its US figure, SendPulse is 11× its US figure, Typebot's Brazil volume is 32× its US volume, and AiSensy's India volume is 86× its US volume. We query the volume-by-country endpoint before making Tier assignment decisions to avoid US-centric distortion.
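
The regional multiples above are ratios of summed per-country volumes. Here is a minimal sketch of that arithmetic, under two assumptions: the LATAM grouping is taken to be the four Latin American target locales (the page does not define it), and the sample volumes are invented for illustration, not Ahrefs data.

```python
from typing import Optional

# Assumed grouping: the Latin American countries among the 10 target locales.
LATAM = {"Brazil", "Mexico", "Argentina", "Colombia"}


def latam_to_us_ratio(volume_by_country: dict[str, int]) -> Optional[float]:
    """Ratio of summed LATAM brand volume to US brand volume, or None if US volume is missing or zero."""
    us = volume_by_country.get("United States")
    if not us:
        return None
    latam_total = sum(v for country, v in volume_by_country.items() if country in LATAM)
    return latam_total / us


# Made-up volumes for a hypothetical brand.
sample = {
    "United States": 1000,
    "Brazil": 2000,
    "Mexico": 600,
    "Argentina": 250,
    "Colombia": 150,
    "Spain": 400,
}
print(latam_to_us_ratio(sample))
```

Returning `None` instead of dividing by a missing US figure keeps an absent data point from being read as a real ratio.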

Failure modes we guard against:

  • Memory-rule-derived heuristics ("LATAM is typically 11× US for chatbot platforms") used to derive per-country numbers without fresh API query → we query the API fresh for every brand vol claim per Rule 10
  • API caching staleness when brand vol shifts between scheduled refreshes → 30-day cache TTL during active review writing
  • Per-country data missing for low-volume brands → flag as "below measurable threshold" rather than report a zero
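
The cache and missing-data guards above can be sketched as follows. This is an illustrative outline only: the helper names are hypothetical, and the cache entry is modeled as a bare fetch timestamp rather than whatever the real cache stores.

```python
from datetime import datetime, timedelta
from typing import Optional

CACHE_TTL = timedelta(days=30)  # the 30-day TTL stated above


def is_cache_fresh(fetched_at: datetime, now: datetime) -> bool:
    """A cached response is usable only while it is within the TTL."""
    return now - fetched_at <= CACHE_TTL


def format_volume(volume: Optional[int]) -> str:
    """Report missing per-country data as below threshold, never as zero."""
    if volume is None:
        return "below measurable threshold"
    return f"{volume:,}"


print(is_cache_fresh(datetime(2026, 5, 1), datetime(2026, 5, 20)))
print(format_volume(None))
print(format_volume(12000))
```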

Meta Business Partner Directory

The canonical source for WhatsApp Business Solution Provider (BSP) partnership claims. BSP status is one of the 14 critical pros/cons triggers and materially affects WhatsApp template approval timelines (24-48 hours for BSP partners vs 5-7 days for non-partners).

| Source | URL | Refresh cadence | What we verify |
| --- | --- | --- | --- |
| Meta Business Partner Directory | facebook.com/business/partner-directory | 6 months Tier 1, on-demand if vendor claims BSP status change | Current partner listing, partner tier (Verified Partner / Solution Provider / Premier), region availability |
| Vendor's WhatsApp product page | .com/whatsapp | 6 months | Vendor self-claimed BSP status and "expedited template approval" framing |

Verification protocol: We cite the Meta Business Partner Directory listing URL in any review that claims BSP status. Where Meta directory access is intermittent or the vendor's listing has changed tier, we soften prose to "vendor-claimed" with verification-depth disclosure rather than rely on vendor self-claim alone.

Other partner directories

| Partner program | Directory URL | Used for | Refresh cadence |
| --- | --- | --- | --- |
| Google Cloud Partner Directory | cloud.google.com/find-a-partner | Google Cloud / Marketing partnership claims | 6 months on-demand |
| AWS Partner Network | aws.amazon.com/partners | AWS partnership claims | 6 months on-demand |
| HubSpot Solutions Directory | ecosystem.hubspot.com/marketplace/solutions | HubSpot partnership claims | 6 months on-demand |
| Shopify Partner Directory | partners.shopify.com/directory | Shopify partnership claims (relevant for ecommerce-bot platforms) | 6 months on-demand |
| TikTok for Business Partner Directory | tiktok.com/business/marketing-partners | TikTok Marketing Partner status | 6 months on-demand |

Partner-status claims that cannot be verified against the relevant directory are softened to "vendor-claimed" with verification-depth disclosure per Rule 9.
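
The softening rule can be expressed as a small decision function. This is a hedged sketch: the function name, the exact output wording, and the example vendor and URL are all hypothetical, and the real review prose is written by editors rather than generated.

```python
from typing import Optional


def partner_claim_wording(vendor: str, directory: str, listing_url: Optional[str]) -> str:
    """Cite the directory listing when one was verified; otherwise soften to vendor-claimed."""
    if listing_url:
        return f"{vendor} is listed in the {directory} ({listing_url})."
    return f"{vendor} partnership is vendor-claimed; no {directory} listing was verified."


# Hypothetical vendor and listing URL, for illustration only.
print(partner_claim_wording("ExampleBot", "Shopify Partner Directory", "https://example.com/listing"))
print(partner_claim_wording("ExampleBot", "Shopify Partner Directory", None))
```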

Crunchbase / PitchBook / vendor press releases

The canonical source for funding totals, vendor stability claims, and corporate-relationship facts (acquisitions, board changes, executive moves).

| Source | URL | Used for | Cost |
| --- | --- | --- | --- |
| Crunchbase | crunchbase.com/organization/ | Funding rounds, lead investors, total raised, executive bio | Free tier access for basic profile; paid tier for full funding round details |
| PitchBook | pitchbook.com/profiles/company/ | Detailed funding round data, comparable transactions | Paid tier required |
| Vendor press releases | Vendor news/press section | Recent funding announcements, partnership announcements, certification announcements | Free |
| TechCrunch / The Information / Business Insider | techcrunch.com, theinformation.com, businessinsider.com | Cross-verification of funding announcements, acquisitions, executive moves | Free for TechCrunch; paid subscriptions for The Information and Business Insider |

Failure modes: Funding round details in training data are typically months or years stale. We verify any funding total claim against the most recent press release or Crunchbase entry, with date stamped.

Hands-on testing observations

First-party data captured during the six-scenario testing protocol. Distinct from vendor or third-party claims — these are measurements we made directly on paid-tier accounts.

| Scenario | Measurement | Recording method | Refresh cadence |
| --- | --- | --- | --- |
| A: Basic FAQ bot | Time-to-first-bot in minutes, 20-query intent accuracy %, friction rating 1-5 | Timer + standardized test query battery + qualitative UX notes | 6 months Tier 1 |
| B: Lead capture + Sheets sync | Setup time, data fidelity %, friction rating | Timer + round-trip test submissions + qualitative UX notes | 6 months Tier 1 |
| C: WhatsApp commerce flow | Setup time, BSP template approval hours, friction rating | Timer + wall-clock template approval tracking + qualitative UX notes | 6 months Tier 1 |
| D: AI knowledge base multi-language | Intent accuracy %, citation rate %, hallucination rate % per language | 20-query test battery in EN + ES + PT-BR + market-relevant fourth language | 6 months Tier 1 |
| E: Human handover | Context-transfer friction 1-5, role-based-access correctness 1-5, multi-user inbox UX 1-5 | Qualitative UX assessment + role permission test | 6 months Tier 1 |
| F: Analytics + ad-conversion tracking | Per-area score 1-5 (dashboard, funnel builder, CSV export, ad-conversion tracking) | Qualitative dashboard audit | 6 months Tier 1 |
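
As an illustration of how a percentage metric falls out of a 20-query battery, the sketch below scores intent accuracy per language. It assumes each query is graded as a simple pass/fail, which the page does not specify, and the per-language results are invented for the example.

```python
def accuracy_pct(results: list[bool]) -> float:
    """Share of queries answered with the correct intent, as a percentage."""
    if not results:
        raise ValueError("empty test battery")
    return 100.0 * sum(results) / len(results)


# Made-up pass/fail outcomes for three 20-query batteries.
battery = {
    "EN": [True] * 18 + [False] * 2,
    "ES": [True] * 17 + [False] * 3,
    "PT-BR": [True] * 15 + [False] * 5,
}
per_language = {lang: accuracy_pct(results) for lang, results in battery.items()}
print(per_language)
```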

Test environment: Chrome on macOS, viewport 1440×900 at 2× retina. Primary locale English with secondary tests in Spanish (LATAM), Brazilian Portuguese, and a market-relevant fourth language. Test accounts created via standard public signup flow.

Inference basis for non-hands-on scenarios: Where direct hands-on measurement is not feasible (vendor demo-gates the account, paid tier blocks specific scenario testing, infrastructure access is restricted), scenario outcomes are anchored to vendor positioning, third-party reviewer reports + G2/Capterra recurring themes, comparable-platform measured benchmarks from existing reviews, and current-generation LLM capability patterns. The inference basis for each non-hands-on observation is documented in the review's POC notes sibling file. See /methodology/review-standards Rule 13 for the standardized format.

Internal data infrastructure

| Data | Purpose | Storage | Access |
| --- | --- | --- | --- |
| data/market-pricing-data.csv | Cross-platform pricing dataset for VfM scoring and comparison engines | Repo CSV | Public-facing comparison outputs derive from this; raw CSV is internal infrastructure |
| Platform DB (Postgres / Neon) | Structured platform records, scoring fields, critical-trigger tracker | Postgres | Backend; surfaces to publishable reviews via component rendering |
| Image library | Watermarked, originality-tier-tagged screenshots from hands-on testing | Cloud storage | Surfaces to publishable reviews via image components |
| Brand vol cache | 30-day Ahrefs response cache during active review writing | Cloud storage | Internal; refreshed automatically |

Source authority and cross-verification

Single-source claims are not sufficient. Every material claim in a Tier 1 review is cross-verified across at least two independent sources where the claim type supports cross-verification. Pricing: vendor pricing page + market-pricing-data CSV. Channels: vendor channels page + vendor pricing page + hands-on observation. AI capabilities: vendor AI product page + vendor help center + hands-on testing observation. Partnership status: vendor self-claim + relevant partner directory.

Vendor self-claim is not sufficient for partnership badges, certifications, or aggregate user metrics. Per Rule 9 of the review hygiene standard, vendor-claimed partnership status without external verification is softened to "vendor-claimed" framing with verification-depth disclosure. Per Rule 6, vendor-claimed aggregate scores (review counts, ratings) are always cross-verified against the relevant aggregator.

Training data is never a valid source. Per Rules 6 and 10, training-data-derived claims (memory, "I recall reading", "typical for vendors in this category") are not acceptable substitutes for primary-source verification. Where primary source is unavailable, the claim is removed or labeled as estimate with methodology disclosure.

Source refresh tracking

Every Chatbotscape page that depends on external sources carries "Last verified" date fields per source type in its frontmatter:

last_verified:
  pricing: 2026-05-24
  features: 2026-05-24
  channels: 2026-05-24
  integrations: 2026-05-24
  certifications: 2026-05-24
  g2_score: 2026-05-25
  capterra_score: 2026-05-25
  trustpilot_score: 2026-05-25
  ahrefs_brand_vol: 2026-05-20
  meta_bsp_directory: 2026-05-25

The page footer surfaces the oldest "Last verified" date as the page's effective freshness signal. Where any date is more than 6 months stale, the page carries a staleness banner pointing the reader to either the next scheduled refresh date or the editorial review process contact form.
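
The footer and banner logic described above can be sketched in a few lines. This is an assumption-laden illustration, not the site's rendering code: the function names are hypothetical, "6 months" is approximated as 183 days, and only three of the frontmatter fields are included.

```python
from datetime import date

STALE_AFTER_DAYS = 183  # assumed reading of "6 months"


def effective_freshness(last_verified: dict[str, date]) -> date:
    """The oldest per-source date, which the footer surfaces."""
    return min(last_verified.values())


def needs_staleness_banner(last_verified: dict[str, date], today: date) -> bool:
    """True if any per-source date is older than the assumed 6-month cutoff."""
    return any((today - d).days > STALE_AFTER_DAYS for d in last_verified.values())


# Subset of the frontmatter example above.
last_verified = {
    "pricing": date(2026, 5, 24),
    "g2_score": date(2026, 5, 25),
    "ahrefs_brand_vol": date(2026, 5, 20),
}
print(effective_freshness(last_verified))
print(needs_staleness_banner(last_verified, date(2026, 12, 1)))
```

Using the oldest date rather than the newest means a single neglected source is enough to age the whole page.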

Version history of this page

  • 2026-05-26 (v1.0) — Initial publication aligned to methodology v3.12.1. Sources catalog formalized after the Tier 1 anchor review batch demonstrated the need for explicit per-source refresh-cadence documentation and failure-mode disclosure.