Real-time in the Composable CDP
Competitors claim that Composable CDPs can’t solve real-time use cases. It’s time to bust that myth.
Jordan Pappas
September 4, 2024
6 minutes
If you’ve evaluated a customer data platform (CDP) anytime in the last year, you’ve likely heard that Composable CDPs like Hightouch can activate data directly from your data warehouse. Composable CDPs are gaining traction rapidly, due to massive advantages in security, ownership, and flexibility. The most common (and confusing) objection we hear, however, is “the Composable CDP doesn’t work for real-time.”
This objection is not true, though it has traces of historical accuracy. Data warehouses were originally built to process data in batches and not in real-time. However, with advances in both data warehouse technology and features that we’ve built for our customers, the Composable CDP can absolutely deliver real-time experiences, often with much more elegant, focused solutions.
In this blog, we’ll dive into some common use cases where real-time data matters and how we solve them.
When does real-time data matter?
In short: real-time capabilities matter when you need to reach a customer with an experience immediately after (or shortly after) they complete a particular action. What counts as “fast enough” depends entirely on your use case. There’s no universal definition of “real-time”; interpretations of the phrase range anywhere from milliseconds to hours.
Common low-latency or “real-time” use cases include:
- Trigger-based messaging: Send emails or texts immediately, based on some new transaction or user event. For example, send an order confirmation email immediately after a purchase (should be done within seconds) or kick off an onboarding sequence right when a new user signs up.
- Onsite next-session personalization: Change a user's experience as they browse a website or app based on their attributes or data from prior visits. Strategies could include dynamic product recommendations or customized search results. If you have historical data on a customer, this is usually a matter of real-time retrieval rather than real-time processing (for example, Netflix could precompute recommended movies for each user and simply retrieve them when that user logs in).
- Onsite same-session personalization: Change a user’s experience as they browse a website or app based on their actions during that visit. For example, a media company could change advertising content, or an ecommerce company could display pricing dynamically based on specific browsing behavior. This type of personalization requires sub-second latency.
- Industry-specific operations: Complete actions that matter for your particular industry. Airlines need to send real-time alerts for gate changes and flight updates. Food delivery companies need to do the same for order updates and to tell customers that drivers have arrived. The exact channels and timing of these actions depend on the use cases that matter to you!
It’s worth noting that plenty of use cases are actually better served when you don’t solve them in real-time. When you take more time to merge and enrich data in the data warehouse, you can improve its quality. For example, you get better match rates (and therefore better performance) when the audiences and conversion events you sync to ad platforms carry as many correct identifiers and customer attributes as possible. Instead of streaming these conversion events in real-time directly to the ad platforms, it’s more valuable to enrich these events within your warehouse before activating them downstream.
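To make the enrichment idea concrete, here is a minimal sketch in Python of filling in missing identifiers on a conversion event from a customer profile before it is sent anywhere. In practice this would usually be a join in the warehouse; the field names (`user_id`, `email`, `phone`) and the `enrich_event` helper are hypothetical and only illustrate the idea.

```python
from typing import Dict


def enrich_event(event: dict, profiles: Dict[str, dict]) -> dict:
    """Return a copy of the event with missing fields filled from the profile.

    Values already present on the event are never overwritten.
    """
    enriched = dict(event)
    profile = profiles.get(event.get("user_id"), {})
    for key, value in profile.items():
        if enriched.get(key) is None:
            enriched[key] = value
    return enriched


# Hypothetical profile table, keyed by user ID.
profiles = {"u1": {"email": "a@example.com", "phone": "+15550100"}}

# A raw conversion event that arrived without an email address.
raw_event = {"user_id": "u1", "event": "purchase", "email": None}

enriched_event = enrich_event(raw_event, profiles)
```

The enriched event now carries two identifiers instead of none, which is the kind of improvement that tends to help match rates downstream.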
Ultimately, you should evaluate each of your data use cases to determine which ones actually benefit from real-time processing. We’ve seen that the majority of business use cases don’t need truly real-time solutions and may perform better (or run at a lower expense) with some intentional batch-based processing.
- Determine the latency you need for each use case. Is it best solved in milliseconds, seconds, or longer?
- For truly “real-time” use cases, you should run a live test with each vendor to see how quickly and effectively they solve them. Since so many vendors make dubious statements about “real-time” capabilities (see also: Salesforce lawsuit), you should rigorously test a solution before buying it.
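One simple way to score such a live test is to record how long each test event takes to arrive downstream and compare a high percentile against the latency you need. The sketch below assumes you have already collected those timings yourself; the sample numbers, the one-second budget, and the helper names are made up for illustration.

```python
import math
from typing import List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_budget(latencies_s: List[float], budget_s: float, pct: float = 95) -> bool:
    """True if the pct-th percentile latency is within the budget."""
    return percentile(latencies_s, pct) <= budget_s


# Hypothetical measurements: seconds between a test event firing and
# the resulting message arriving in the downstream tool.
observed = [0.4, 0.6, 0.5, 2.1, 0.7, 0.5, 0.9, 0.6, 0.8, 0.5]

median_latency = percentile(observed, 50)
within_one_second = meets_budget(observed, budget_s=1.0)
```

Looking at a high percentile rather than the average matters here: in this made-up sample the median is well under a second, but one slow delivery is enough to miss a one-second budget at the 95th percentile.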
How Hightouch solves real-time use cases
Since “real-time” use cases are so varied, we’ve built several solutions that can solve anything from sub-second to multiple-minute latencies. Our three primary solutions for real-time actions are event streaming, streaming Reverse ETL, and a Personalization API.
Choose the best way to use your data in real-time, based on your specific use cases.
Real-time solution #1: Event streaming
Event streaming (or “event forwarding”) is part of our event collection product, and is comparable to event streaming features in many traditional CDPs. When users complete actions on your website or mobile app, event streaming can forward those events directly to downstream destinations like Iterable or Meta in less than a second. This same solution also supports back-end triggers, webhooks from other systems, and even native connections to streaming data sources such as Apache Kafka.
Event streaming is best for actions that need to be triggered immediately based on user behavior, such as trigger-based messaging.
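Conceptually, event forwarding is a fan-out: each incoming event is checked against every destination, reshaped into that destination's payload, and delivered. Here is a minimal sketch of that pattern in Python. It is not Hightouch's implementation or API; the `Destination` class, the destination names, and the payload shapes are all hypothetical, and in-memory lists stand in for real network calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Destination:
    """A hypothetical downstream tool that receives forwarded events."""
    name: str
    accepts: Callable[[dict], bool]    # which events this destination wants
    transform: Callable[[dict], dict]  # reshape the event into its payload
    send: Callable[[dict], None]       # deliver it (an HTTP call in real life)


def forward_event(event: dict, destinations: List[Destination]) -> List[str]:
    """Send the event to every destination that accepts it; return their names."""
    delivered = []
    for dest in destinations:
        if dest.accepts(event):
            dest.send(dest.transform(event))
            delivered.append(dest.name)
    return delivered


# In-memory stand-ins for real delivery, so the sketch runs without a network.
email_outbox: List[dict] = []
ads_outbox: List[dict] = []

destinations = [
    Destination(
        name="email_tool",
        accepts=lambda e: e["type"] == "order_completed",
        transform=lambda e: {"to": e["email"], "template": "order_confirmation"},
        send=email_outbox.append,
    ),
    Destination(
        name="ads_platform",
        accepts=lambda e: e["type"] in {"order_completed", "signup"},
        transform=lambda e: {"event_name": e["type"], "user_id": e["user_id"]},
        send=ads_outbox.append,
    ),
]

delivered = forward_event(
    {"type": "order_completed", "user_id": "u1", "email": "a@example.com"},
    destinations,
)
```

A purchase event reaches both destinations here, while a signup event would only reach the second one, because the first destination's filter ignores it.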
Real-time solution #2: Streaming Reverse ETL
Streaming Reverse ETL syncs data from data warehouses to downstream tools, rather than directly from websites or mobile apps. This allows you to process data in the warehouse to clean, model, and otherwise enrich it. The key innovations that support high-speed (1-2 minute latencies) data delivery are modern data warehouse features that enable warehouse modeling on a rolling and streaming basis, like Snowflake’s Dynamic Tables, Databricks Delta Tables, or Google BigQuery’s Continuous Queries.
Streaming Reverse ETL is best suited for actions that you want to take quickly based on user behavior and that benefit from modeling and enrichment in the data warehouse, such as some trigger-based messages and sending conversion events to advertising platforms.
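One common way to keep warehouse-to-tool syncs fast is to send only what changed since the last run, rather than resending every row. The sketch below shows that diffing step in Python, assuming each modeled row has a primary key. It illustrates the general technique, not necessarily how Hightouch implements it, and the `diff_rows` helper and `ltv` column are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def diff_rows(
    previous: Dict[str, Row], current: Dict[str, Row]
) -> Tuple[Dict[str, Row], Dict[str, Row], List[str]]:
    """Compare two snapshots keyed by primary key.

    Returns (added rows, changed rows, removed keys).
    """
    added = {k: v for k, v in current.items() if k not in previous}
    changed = {
        k: v for k, v in current.items() if k in previous and previous[k] != v
    }
    removed = [k for k in previous if k not in current]
    return added, changed, removed


# Snapshot from the last sync and the latest model results (made-up data).
previous = {"u1": {"ltv": 120}, "u2": {"ltv": 40}}
current = {"u1": {"ltv": 150}, "u3": {"ltv": 10}}

added, changed, removed = diff_rows(previous, current)
```

Only one new row, one update, and one removal need to be sent downstream in this example, which keeps each sync small even when the underlying table is large.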
Real-time solution #3: Personalization API
We built a Personalization API to provide an “always-ready” endpoint that your website or mobile app can call to power in-session personalization (for example, recommended products). It works by letting you precompute any SQL query, audience, or computed attribute in your data warehouse, then “cache” those results so they’re ready for the API to serve on demand, at sub-second latencies.
The Personalization API is best suited for onsite next-session personalization because it allows you to change onsite or in-app experiences based on previously modeled data about a customer.
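The precompute-then-cache pattern behind this kind of API can be sketched in a few lines of Python. This is an illustration of the pattern only, not the Personalization API itself: the `RecommendationCache` class, the row shape, and the fallback list are hypothetical, and a plain dictionary stands in for a real low-latency store.

```python
from typing import Dict, Iterable, List


class RecommendationCache:
    """Holds precomputed results so lookups never touch the warehouse."""

    def __init__(self, default: List[str]):
        self._rows: Dict[str, List[str]] = {}
        self._default = default

    def refresh(self, rows: Iterable[dict]) -> None:
        """Replace the cache with the latest precomputed results."""
        # Build the new mapping first, then swap it in with one assignment.
        self._rows = {
            row["user_id"]: row["recommended_products"] for row in rows
        }

    def get(self, user_id: str) -> List[str]:
        """Return cached recommendations, or the default for unknown users."""
        return self._rows.get(user_id, self._default)


# Rows as they might come back from a scheduled warehouse query (made-up data).
warehouse_rows = [
    {"user_id": "u1", "recommended_products": ["tent", "stove"]},
    {"user_id": "u2", "recommended_products": ["boots"]},
]

cache = RecommendationCache(default=["bestsellers"])
cache.refresh(warehouse_rows)

known_user = cache.get("u1")
new_visitor = cache.get("u404")
```

The heavy work happens ahead of time in the warehouse, so the request path is a single key lookup, with a sensible default for visitors who have no precomputed results yet.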
The Composable CDP 🤝 Real Time
Hightouch is building the Composable Customer Data Platform centered on the data warehouse. With these innovations, the Composable CDP will increasingly support real-time customer data needs. Streaming has been one of the last barriers blocking some marketing teams from using the warehouse to power their operations, so we’re thrilled to share these innovations that will make the warehouse more useful, and real-time, than ever.
If you’d like to explore how you can turn your data warehouse into a source of value for your marketing, sales, and operations teams, book a meeting with our solutions team.