Data Sources

This page catalogs every external data source Chatbotscape uses across its review, scoring, and ratings infrastructure. For each source, we document what it provides, how it's queried, what the refresh cadence is, what cost or rate-limit constraints apply, and what failure modes we've observed. Researchers, journalists, and methodology auditors can use this page to triangulate any Chatbotscape claim against its underlying source.

Primary sources for claim verification

Vendor public pages

The canonical source for any vendor's pricing, features, channels, integrations, certifications, and partnership claims. Every Tier 1 review verifies these claims directly against the vendor's current public pages within 30 days of publication for pricing and 6 months for functional claims.

Vendor page type	Used for	Refresh cadence	Capture method
Pricing page	Pricing tiers, contact/seat limits, overage rates, free trial duration, currency variations	90 days Tier 1	Direct browser visit; monthly-billed toggle activated; URL parameters tested for multi-region
Product/features pages	AI capabilities, NLU language support, RAG/multimodal, BYOLLM claims	6 months Tier 1	Direct browser visit; feature claims captured against published vendor documentation
Integrations page	Native integration list, partner-platform names, third-party connector availability	6 months Tier 1	Direct browser visit; each named integration verified by name on vendor's integration directory
Channels page	Channel availability per tier, beta vs production status, BSP routing	6 months Tier 1	Direct browser visit; per-channel page checked for beta indicators
Help center / docs	Feature operational details, configuration depth, language coverage of documentation	6 months Tier 1	Direct browser visit; language switcher checked for help center localization scope
Trust / compliance / security pages	HIPAA, SOC 2, GDPR, LGPD, PCI-DSS certification claims	6 months Tier 1	Direct browser visit; certification claims verified on vendor's trust page or compliance documentation
About page	Founders, founding year, headquarters, team size	12 months Tier 1	Direct browser visit; cross-checked against LinkedIn vendor profile and press releases

Failure modes we've observed and guard against:

Vendors A/B testing pricing pages with different cohort-specific pricing → we capture from a clean session without prior cookies
Vendors showing default-region currency that differs from the buyer's actual purchase currency → we capture native USD where the vendor supports it, and flag conversion math otherwise
Vendors burying tier limits inside expandable accordions or tooltip overlays → we expand all UI elements before capture
Vendors using "contact for pricing" framing on lower-tier plans for SMB segments → we capture this as a methodology note, not as missing data

Third-party aggregator pages (G2, Capterra, TrustPilot)

The canonical source for user-voice claims — review counts, star ratings, sub-rating breakdowns, recurring strengths and weaknesses patterns. Used in the User-Feedback-Patterns section of every Tier 1 review.

Aggregator	URL pattern	Refresh cadence	What we extract
G2	g2.com/products//reviews	6 months Tier 1	Star rating (overall), review count, sub-rating breakdown (Ease of Use, Features, Customer Service, Likelihood to Recommend), recurring strengths/weaknesses themes
Capterra	capterra.com/p///	6 months Tier 1	Star rating, review count, sub-ratings (Ease of Use, Features, Value for Money, Customer Service), Capterra's vendor-rating breakdown
TrustPilot	trustpilot.com/review/	6 months Tier 1	Star rating, review count, sentiment trend, complaint categories
Reddit (r/)	reddit.com/r/	Spot check at refresh	Recent community discussion themes; supplementary signal only
Product Hunt	producthunt.com/products/	12 months Tier 1	Upvote count, launch context (signal of vendor stage)
Reclame Aqui (Brazil-focused)	reclameaqui.com.br/	6 months for LATAM-priority reviews	Brazilian consumer complaint patterns; LATAM-priority signal

Failure modes we've observed:

G2 review counts in training data inflated roughly 48× over actual (claimed 7,800 vs actual 163) → we always verify against the current G2 page
TrustPilot ratings often diverge significantly from G2/Capterra (G2 4.6 vs TrustPilot 2.5) → we publish both rather than assuming parity
Capterra does not publicly show every sub-rating dimension for every platform → we publish what's visible and flag what's missing
Selective excerpting of aggregator reviews to manufacture sentiment → we report the dominant signal across at least 6 months, not the most flattering or most damning excerpts

Ahrefs API

The canonical source for multi-locale brand search volume — the primary input to platform popularity scoring (dimension 15 in the rubric).

Endpoint	Used for	Cost	Refresh cadence
`keywords-explorer-volume-by-country`	Per-country brand search volume across 10 target locales	API units (Standard plan: 50k units/month, ~10% usage)	90 days Tier 1, 30-day cache during active review writing
`keywords-explorer-keyword-difficulty`	KD signal for category and best-of list keyword targeting	API units	Pre-drafting for new pages; cached 30 days
`serp-overview`	SERP feature presence	API units	Spot check pre-drafting

Target locales: United States, Brazil, Mexico, Spain, Argentina, Colombia, India, United Kingdom, Germany, France. Multi-locale aggregation matters because US-only ranking severely distorts priority for LATAM and other non-US markets: Manychat's LATAM brand volume is 3× its US figure, SendPulse's LATAM volume is 11× its US figure, Typebot's Brazil volume is 32× its US volume, and AiSensy's India volume is 86× its US volume. We query the volume-by-country endpoint before Tier assignment decisions to avoid US-centric distortion.

Failure modes we guard against:

Rule-of-thumb heuristics recalled from memory ("LATAM is typically 11× US for chatbot platforms") being used to derive per-country numbers without a fresh API query → per Rule 10, we query the API fresh for every brand volume claim
Cached responses going stale when brand volume shifts between scheduled refreshes → we apply a 30-day cache TTL during active review writing
Per-country data missing for low-volume brands → we flag the figure as "below measurable threshold" rather than report a zero
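
The multi-locale comparison described above comes down to dividing each country's brand volume by the US volume. The following is a minimal Python sketch under stated assumptions: the volumes are already fetched and held in a plain dictionary, and the function name, country codes, and sample numbers are hypothetical illustrations, not Chatbotscape's pipeline or real Ahrefs output. Missing per-country figures are labeled "below measurable threshold" instead of being treated as zero.

```python
from typing import Optional

BELOW_THRESHOLD = "below measurable threshold"
NO_BASELINE = "US baseline unavailable"  # hypothetical label, not from the rubric


def ratios_vs_us(volumes: dict[str, Optional[int]]) -> dict[str, object]:
    """Express each locale's brand volume as a multiple of the US volume.

    `volumes` maps a country code to a monthly search volume, or to None
    when no per-country figure was returned. Illustrative only.
    """
    us = volumes.get("US")
    result: dict[str, object] = {}
    for country, volume in volumes.items():
        if country == "US":
            continue
        if volume is None:
            # A missing figure is flagged, never reported as zero.
            result[country] = BELOW_THRESHOLD
        elif not us:
            # Without a usable US figure the ratio is undefined.
            result[country] = NO_BASELINE
        else:
            result[country] = round(volume / us, 1)
    return result


# Hypothetical sample numbers, not real data.
sample = {"US": 1000, "BR": 32000, "IN": None}
print(ratios_vs_us(sample))
```

Keeping the flag as a string rather than a number makes it hard for a downstream ranking step to mistake a missing figure for a measured zero.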

Meta Business Partner Directory

The canonical source for WhatsApp Business Solution Provider (BSP) partnership claims. BSP status is one of the 14 critical pros/cons triggers and materially affects WhatsApp template approval timelines (24-48 hours for BSP partners vs 5-7 days for non-partners).

Source	URL	Refresh cadence	What we verify
Meta Business Partner Directory	facebook.com/business/partner-directory	6 months Tier 1, on-demand if vendor claims BSP status change	Current partner listing, partner tier (Verified Partner / Solution Provider / Premier), region availability
Vendor's WhatsApp product page	.com/whatsapp	6 months	Vendor self-claimed BSP status and "expedited template approval" framing

Verification protocol: We cite the Meta Business Partner Directory listing URL in any review that claims BSP status. Where Meta directory access is intermittent or the vendor's listing has changed tier, we soften prose to "vendor-claimed" with verification-depth disclosure rather than rely on vendor self-claim alone.

Other partner directories

Partner program	Directory URL	Used for	Refresh cadence
Google Cloud Partner Directory	cloud.google.com/find-a-partner	Google Cloud / Marketing partnership claims	6 months on-demand
AWS Partner Network	aws.amazon.com/partners	AWS partnership claims	6 months on-demand
HubSpot Solutions Directory	ecosystem.hubspot.com/marketplace/solutions	HubSpot partnership claims	6 months on-demand
Shopify Partner Directory	partners.shopify.com/directory	Shopify partnership claims (relevant for ecommerce-bot platforms)	6 months on-demand
TikTok for Business Partner Directory	tiktok.com/business/marketing-partners	TikTok Marketing Partner status	6 months on-demand

Partner-status claims that cannot be verified against the relevant directory are softened to "vendor-claimed" with verification-depth disclosure per Rule 9.
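
The softening rule can be read as a small decision function. This is a sketch only: the function name and the "verified" and "no claim" labels are hypothetical, and only the "vendor-claimed" wording comes from the rule above.

```python
from typing import Optional


def partner_status_label(vendor_claims: bool, directory_listed: Optional[bool]) -> str:
    """Choose the framing for a partnership claim.

    directory_listed is True when the listing was found in the relevant
    partner directory, False when the directory was checked and the vendor
    was not found, and None when the directory could not be checked
    (for example, intermittent access).
    """
    if directory_listed is True:
        return "verified"
    if vendor_claims:
        # Claim exists but could not be confirmed against the directory.
        return "vendor-claimed"
    return "no claim"


for claims, listed in [(True, True), (True, None), (True, False), (False, False)]:
    print(claims, listed, "->", partner_status_label(claims, listed))
```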

Crunchbase / PitchBook / vendor press releases

The canonical source for funding totals, vendor stability claims, and corporate-relationship facts (acquisitions, board changes, executive moves).

Source	URL	Used for	Cost
Crunchbase	crunchbase.com/organization/	Funding rounds, lead investors, total raised, executive bio	Free tier access for basic profile; paid tier for full funding round details
PitchBook	pitchbook.com/profiles/company/	Detailed funding round data, comparable transactions	Paid tier required
Vendor press releases	Vendor news/press section	Recent funding announcements, partnership announcements, certification announcements	Free
TechCrunch / The Information / Business Insider	techcrunch.com, theinformation.com, businessinsider.com	Cross-verification of funding announcements, acquisitions, executive moves	Free for TechCrunch; paid subscriptions for The Information and Business Insider

Failure modes: Funding round details in training data are typically months or years stale. We verify any funding total claim against the most recent press release or Crunchbase entry and date-stamp the figure.

Hands-on testing observations

First-party data captured during the six-scenario testing protocol. Distinct from vendor or third-party claims — these are measurements we made directly on paid-tier accounts.

Scenario	Measurement	Recording method	Refresh cadence
A — Basic FAQ bot	Time-to-first-bot in minutes, 20-query intent accuracy %, friction rating 1-5	Timer + standardized test query battery + qualitative UX notes	6 months Tier 1
B — Lead capture + Sheets sync	Setup time, data fidelity %, friction rating	Timer + round-trip test submissions + qualitative UX notes	6 months Tier 1
C — WhatsApp commerce flow	Setup time, BSP template approval hours, friction rating	Timer + wall-clock template approval tracking + qualitative UX notes	6 months Tier 1
D — AI knowledge base multi-language	Intent accuracy %, citation rate %, hallucination rate % per language	20-query test battery in EN + ES + PT-BR + market-relevant fourth language	6 months Tier 1
E — Human handover	Context-transfer friction 1-5, role-based-access correctness 1-5, multi-user inbox UX 1-5	Qualitative UX assessment + role permission test	6 months Tier 1
F — Analytics + ad-conversion tracking	Per-area score 1-5 (dashboard, funnel builder, CSV export, ad-conversion tracking)	Qualitative dashboard audit	6 months Tier 1
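
As an illustration of how the percentage measurements in scenarios A and D could be tallied from a 20-query battery, here is a minimal Python sketch. The record fields, the function name, and the choice to use the full battery as the denominator for all three rates are assumptions made for this example; the protocol above does not specify them.

```python
from dataclasses import dataclass


@dataclass
class QueryResult:
    intent_correct: bool  # bot resolved the intended intent
    cited_source: bool    # answer cited a knowledge-base source
    hallucinated: bool    # answer contained unsupported content


def battery_rates(results: list[QueryResult]) -> dict[str, float]:
    """Return intent accuracy, citation rate, and hallucination rate in percent."""
    if not results:
        raise ValueError("test battery is empty")
    n = len(results)
    return {
        "intent_accuracy_pct": 100 * sum(r.intent_correct for r in results) / n,
        "citation_rate_pct": 100 * sum(r.cited_source for r in results) / n,
        "hallucination_rate_pct": 100 * sum(r.hallucinated for r in results) / n,
    }


# Hypothetical 20-query battery for one language.
battery = (
    [QueryResult(True, True, False)] * 15
    + [QueryResult(True, False, False)] * 2
    + [QueryResult(False, False, True)] * 2
    + [QueryResult(False, False, False)] * 1
)
print(battery_rates(battery))
```

Running the same function once per language keeps the per-language figures for scenario D directly comparable.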

Test environment: Chrome on macOS, viewport 1440×900 at 2× retina. Primary locale English with secondary tests in Spanish (LATAM), Brazilian Portuguese, and a market-relevant fourth language. Test accounts created via standard public signup flow.

Inference basis for non-hands-on scenarios: Where direct hands-on measurement is not feasible (the vendor demo-gates the account, the paid tier blocks a specific scenario, or infrastructure access is restricted), scenario outcomes are anchored to four inputs: vendor positioning; third-party reviewer reports and recurring G2/Capterra themes; measured benchmarks for comparable platforms from existing reviews; and current-generation LLM capability patterns. The inference basis for each non-hands-on observation is documented in the review's POC notes sibling file. See /methodology/review-standards Rule 13 for the standardized format.

Internal data infrastructure

Data	Purpose	Storage	Access
`data/market-pricing-data.csv`	Cross-platform pricing dataset for VfM scoring and comparison engines	Repo CSV	Public-facing comparison outputs derive from this; raw CSV is internal infrastructure
Platform DB (Postgres / Neon)	Structured platform records, scoring fields, critical-trigger tracker	Postgres	Backend; surfaces to publishable reviews via component rendering
Image library	Watermarked, originality-tier-tagged screenshots from hands-on testing	Cloud storage	Surfaces to publishable reviews via image components
Brand vol cache	30-day Ahrefs response cache during active review writing	Cloud storage	Internal; refreshed automatically
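
The 30-day brand volume cache can be pictured as a lookup that refuses to return entries older than the TTL. The sketch below is an in-memory illustration under assumptions: the class name, method names, and key shape (brand, country) are hypothetical, and the real cache lives in cloud storage as noted in the table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

CACHE_TTL = timedelta(days=30)


class BrandVolCache:
    """Tiny in-memory cache keyed by (brand, country). Illustrative only."""

    def __init__(self, ttl: timedelta = CACHE_TTL) -> None:
        self._ttl = ttl
        self._entries: dict[tuple[str, str], tuple[int, datetime]] = {}

    def put(self, brand: str, country: str, volume: int, fetched_at: datetime) -> None:
        self._entries[(brand, country)] = (volume, fetched_at)

    def get(self, brand: str, country: str, now: datetime) -> Optional[int]:
        entry = self._entries.get((brand, country))
        if entry is None:
            return None
        volume, fetched_at = entry
        if now - fetched_at >= self._ttl:
            # Expired: the caller should issue a fresh API query.
            return None
        return volume


cache = BrandVolCache()
t0 = datetime(2026, 5, 20, tzinfo=timezone.utc)
cache.put("examplebot", "BR", 1200, fetched_at=t0)  # hypothetical brand and volume
print(cache.get("examplebot", "BR", now=t0 + timedelta(days=29)))
print(cache.get("examplebot", "BR", now=t0 + timedelta(days=30)))
```

Returning None for both a miss and an expired entry means the caller has a single path: no value, so query fresh.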

Source authority and cross-verification

Single-source claims are not sufficient. Every material claim in a Tier 1 review is cross-verified across at least two independent sources where the claim type supports cross-verification. Pricing: vendor pricing page + market-pricing-data CSV. Channels: vendor channels page + vendor pricing page + hands-on observation. AI capabilities: vendor AI product page + vendor help center + hands-on testing observation. Partnership status: vendor self-claim + relevant partner directory.
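
One way to make the two-source rule checkable is to record which sources were actually consulted for a claim and compare them against the sources listed for that claim type. This Python sketch is an assumption-laden illustration: the dictionary keys, the function name, and the reading that any two of the listed sources satisfy the rule are hypothetical choices, not the documented review tooling.

```python
# Source lists taken from the paragraph above; key names are hypothetical.
EXPECTED_SOURCES: dict[str, set[str]] = {
    "pricing": {"vendor pricing page", "market-pricing-data CSV"},
    "channels": {"vendor channels page", "vendor pricing page", "hands-on observation"},
    "ai_capabilities": {
        "vendor AI product page",
        "vendor help center",
        "hands-on testing observation",
    },
    "partnership": {"vendor self-claim", "partner directory"},
}

MIN_INDEPENDENT_SOURCES = 2


def cross_verification_report(claim_type: str, sources_checked: set[str]) -> dict[str, object]:
    """Report whether a claim meets the two-source minimum and what is missing."""
    expected = EXPECTED_SOURCES[claim_type]
    confirmed = expected & sources_checked
    return {
        "cross_verified": len(confirmed) >= MIN_INDEPENDENT_SOURCES,
        "confirmed": sorted(confirmed),
        "missing": sorted(expected - sources_checked),
    }


report = cross_verification_report(
    "channels", {"vendor channels page", "hands-on observation"}
)
print(report)
```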

Vendor self-claim is not sufficient for partnership badges, certifications, or aggregate user metrics. Per Rule 9 of the review hygiene standard, vendor-claimed partnership status without external verification is softened to "vendor-claimed" framing with verification-depth disclosure. Per Rule 6, vendor-claimed aggregate scores (review counts, ratings) are always cross-verified against the relevant aggregator.

Training data is never a valid source. Per Rules 6 and 10, training-data-derived claims (memory, "I recall reading", "typical for vendors in this category") are not acceptable substitutes for primary-source verification. Where a primary source is unavailable, the claim is removed or labeled as an estimate with methodology disclosure.

Source refresh tracking

Every Chatbotscape page that depends on external sources carries "Last verified" date fields per source type in its frontmatter:

last_verified:
  pricing: 2026-05-24
  features: 2026-05-24
  channels: 2026-05-24
  integrations: 2026-05-24
  certifications: 2026-05-24
  g2_score: 2026-05-25
  capterra_score: 2026-05-25
  trustpilot_score: 2026-05-25
  ahrefs_brand_vol: 2026-05-20
  meta_bsp_directory: 2026-05-25

The page footer surfaces the oldest "Last verified" date as the page's effective freshness signal. Where any date is more than 6 months stale, the page carries a staleness banner pointing the reader to either the next scheduled refresh date or the editorial review process contact form.
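
The footer logic amounts to taking the minimum of the per-source dates and comparing its age against the six-month limit. A minimal Python sketch follows, assuming the frontmatter has already been parsed into a dictionary of dates; the function names are hypothetical, and treating "6 months" as 183 days is an approximation chosen for this example.

```python
from datetime import date

MAX_AGE_DAYS = 183  # assumption: "6 months" approximated as 183 days


def effective_freshness(last_verified: dict[str, date]) -> date:
    """The oldest per-source date is the page's freshness signal."""
    return min(last_verified.values())


def needs_staleness_banner(last_verified: dict[str, date], today: date) -> bool:
    """True when the oldest verification date is more than the limit old."""
    return (today - effective_freshness(last_verified)).days > MAX_AGE_DAYS


# Subset of the frontmatter example above.
last_verified = {
    "pricing": date(2026, 5, 24),
    "g2_score": date(2026, 5, 25),
    "ahrefs_brand_vol": date(2026, 5, 20),
    "meta_bsp_directory": date(2026, 5, 25),
}

print(effective_freshness(last_verified))
print(needs_staleness_banner(last_verified, today=date(2026, 6, 1)))
print(needs_staleness_banner(last_verified, today=date(2027, 1, 1)))
```

Because the check keys off the oldest date, one neglected source type is enough to trigger the banner even when every other field was refreshed recently.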

Version history of this page

2026-05-26 (v1.0) — Initial publication aligned to methodology v3.12.1. Sources catalog formalized after the Tier 1 anchor review batch demonstrated the need for explicit per-source refresh-cadence documentation and failure-mode disclosure.

Methodology overview — full scoring rubric and source weights
Review standards — 14 pre-publish quality gates including source verification
Update policy — refresh cadences and version history protocol
Editorial policy — fact-verification standards