What is Salesforce Data Cloud?
Is Salesforce Data Cloud a viable option for customer data use cases? Should you buy it?
Luke Kline
July 9, 2024
10 minutes
If you’ve been in the MarTech space long enough, then you know that Salesforce has been trying to expand its product offerings into the customer data platform (CDP) world for some time. The newest product in this category is Salesforce Data Cloud. So the question is: What exactly is Salesforce Data Cloud? Why is it so hyped? And is the platform a viable option?
If you’re evaluating enterprise CDPs, then this blog post will tell you everything you need to know about Salesforce Data Cloud, including:
- What is Salesforce Data Cloud?
- How Does Salesforce Data Cloud Work?
- The Architecture Behind Salesforce Data Cloud
- Salesforce Data Cloud Features
- How Much Does Salesforce Data Cloud Cost?
- Should You Buy Salesforce Data Cloud?
What is Salesforce Data Cloud?
Salesforce Data Cloud is a data lake that underpins the entire Salesforce ecosystem. It helps you consolidate all of your customer data so you can create unified customer profiles. The entire purpose of the product is to give your business teams access to more customer data so they can power personalized experiences.
Most Salesforce products operate as their own separate entities with their own separate architectures. The goal of Data Cloud is to bring all of your Salesforce data together in one place so you can more easily access and share data between applications in the Salesforce ecosystem.
How Does Salesforce Data Cloud Work?
Is Salesforce Data Cloud a CDP? The short answer is yes, but it’s not quite so straightforward. In fact, Data Cloud used to be called Salesforce CDP, but the product was rebranded to align with newer product capabilities.
While the platform can do a lot, it works very similarly to other CDPs in that it helps you collect, model, segment, and activate data. Salesforce Data Cloud ships with pre-built, native connectors that let you establish secure connections to other Salesforce applications and external sources. Once you’ve established these connections, you can create and configure data streams to define which data points you bring into Salesforce Data Cloud.
After your data is ingested into Salesforce Data Cloud, you can transform and join it to build comprehensive customer profiles and audience segments, which you can then send back to native Salesforce applications like Marketing Cloud to power downstream operational use cases. However, to actually use the marketer-facing features that Salesforce Data Cloud provides, all of your underlying data has to be mapped to the standard Salesforce Customer 360 Data Model, which can be challenging, especially if your business has proprietary data and custom entities/objects.
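To make that mapping requirement concrete, here’s a minimal sketch of the kind of rename-and-join work it involves, written in plain Python/pandas with hypothetical source tables and field names. The real work happens inside Data Cloud’s transformation tooling and the Customer 360 Data Model, not in pandas; this is only meant to show the shape of the problem.

```python
import pandas as pd

# Hypothetical raw extracts from two ingested sources (e.g., an e-commerce
# platform and an email tool). Field names are source-specific on purpose.
orders = pd.DataFrame({
    "cust_email": ["ada@example.com", "sam@example.com"],
    "lifetime_spend": [1250.00, 310.50],
})
subscribers = pd.DataFrame({
    "email_address": ["ada@example.com", "sam@example.com"],
    "first_name": ["Ada", "Sam"],
    "opted_in": [True, False],
})

# Rename source-specific fields to a shared, standard-style schema
# (a stand-in for mapping data lake object fields to data model object fields).
orders = orders.rename(columns={"cust_email": "email"})
subscribers = subscribers.rename(columns={"email_address": "email"})

# Join on the shared key to get one profile row per customer.
profiles = subscribers.merge(orders, on="email", how="left")
print(profiles)
```

The pain shows up when your sources have entities the standard model doesn’t anticipate; every one of them needs a mapping like the renames above before the marketer-facing features can use it.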
Salesforce Data Cloud has a lot of capabilities, but you can really synthesize everything into five key functions:
- Data Collection: Bringing data into Salesforce Data Cloud and storing it
- Data Transformation: Merging, joining, filtering, cleansing, normalizing, and reshaping data to map back to the Salesforce data model
- Identity Resolution: Unifying and resolving customer profiles (see the sketch after this list)
- Segmentation & Audience Building: Building and segmenting users based on attributes and traits using an audience manager
- Data Activation: Syncing audience segments and data points to other Salesforce applications
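To illustrate the identity resolution step, here’s a small, hypothetical sketch of a deterministic match rule in Python: records are grouped when they share a normalized email, and surviving fields are merged into one profile. Data Cloud’s actual match and reconciliation rules are configured inside the platform and are more sophisticated than this; the point is just what “unifying and resolving profiles” means mechanically.

```python
from collections import defaultdict

# Hypothetical records from different sources describing the same people.
records = [
    {"source": "crm",   "email": "Ada@Example.com", "phone": None,       "name": "Ada Lovelace"},
    {"source": "store", "email": "ada@example.com", "phone": "555-0100", "name": None},
    {"source": "email", "email": "sam@example.com", "phone": "555-0199", "name": "Sam Jones"},
]

def match_key(record):
    # Deterministic rule: normalize the email and treat it as the identity key.
    return record["email"].strip().lower()

# Group records by match key, then merge fields (first non-null value wins).
groups = defaultdict(list)
for rec in records:
    groups[match_key(rec)].append(rec)

profiles = []
for key, recs in groups.items():
    merged = {"email": key}
    for field in ("phone", "name"):
        merged[field] = next((r[field] for r in recs if r[field]), None)
    profiles.append(merged)

print(profiles)  # two unified profiles built from three source records
```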
The Architecture Behind Salesforce Data Cloud
What most people don’t realize is that most Salesforce applications are built on entirely different, isolated back-end architectures, because many popular Salesforce products were acquired and rebranded under the Salesforce umbrella. Architecturally, this makes it very difficult to move data back and forth between systems unless you use some sort of integration tool like MuleSoft.
Salesforce Data Cloud is unique in that its entire underlying architecture is built around Apache Iceberg, an open-source table format that reduces data storage costs and allows data to be accessed more efficiently. The general idea behind Iceberg is that you can decouple compute and storage and use the query engine of your choice to manipulate your data.
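As a rough illustration of that decoupling, here’s a short sketch using the open-source PyIceberg client to read an Iceberg table straight from object storage via a catalog. The catalog URI, bucket, and table names are hypothetical, and this says nothing about how Salesforce wires Iceberg up internally; it just shows that the tables live in storage and any Iceberg-aware engine can read them.

```python
# pip install "pyiceberg[pyarrow,pandas]"
from pyiceberg.catalog import load_catalog

# Hypothetical REST catalog and warehouse location. Any Iceberg-aware engine
# (Spark, Trino, Snowflake, DuckDB, ...) could read the same tables.
catalog = load_catalog(
    "lake",
    **{
        "uri": "https://catalog.example.com",
        "warehouse": "s3://example-bucket/warehouse",
    },
)

# Load the table's metadata, push down a filter, and pull the result into pandas.
table = catalog.load_table("marketing.unified_contacts")
df = (
    table.scan(row_filter="country = 'US'", selected_fields=("email", "lifetime_value"))
    .to_pandas()
)
print(df.head())
```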
The value here is that this architecture allows Salesforce to more easily share data from Salesforce Data Cloud with data warehouses or, more importantly, bring data into Data Cloud from data warehouses. To enable this bi-directional data sharing, there are two critical components within this architecture: zero-copy integration and bring-your-own-lake (BYOL) data federation.
Understanding How Zero-Copy Works
Zero-copy integration is the process of virtualizing your Salesforce data in a data warehouse so you can query the underlying Salesforce data directly from your warehouse without storing a physical copy there. Put simply, zero-copy integration uses a Salesforce-managed data warehouse to take advantage of the native data-sharing capabilities that warehouses like Snowflake provide. Metadata from your Iceberg tables is shared with your data warehouse to create virtual tables, which you can query at any time, but the underlying data always lives in Data Cloud.
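From the warehouse side, querying one of those virtual tables looks like querying any other table. Here’s a hedged sketch using Snowflake’s Python connector; the credentials and the database, schema, and table names are purely hypothetical placeholders for whatever objects the Data Cloud share actually exposes in your account.

```python
# pip install snowflake-connector-python
import snowflake.connector

# Placeholder credentials; in practice these come from your secrets manager.
conn = snowflake.connector.connect(
    account="my_account",
    user="analyst",
    password="********",
    warehouse="ANALYTICS_WH",
)

# The table is virtual: it is backed by Iceberg metadata shared from
# Data Cloud, so no physical copy of the rows lives in this account.
cur = conn.cursor()
cur.execute(
    """
    SELECT email, lifetime_value
    FROM DATA_CLOUD_SHARE.UNIFIED.CUSTOMER_PROFILES
    WHERE country = 'US'
    LIMIT 10
    """
)
for row in cur.fetchall():
    print(row)
conn.close()
```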
Understanding Bring Your Own Lake (BYOL) Data Federation
BYOL data federation is the opposite of zero-copy: it’s the process of virtualizing your warehouse data in Salesforce Data Cloud. Put simply, Salesforce Data Cloud mounts the tables from your warehouse as an external data lake object in the platform. This object acts as a storage container that houses metadata and points to the actual physical data in your warehouse. With this capability, instead of querying data stored in Salesforce Data Cloud, Salesforce Data Cloud queries the underlying data where it lives in your data warehouse.
Bear in mind, there’s a lot more complexity that goes on under the hood of the bi-directional sharing capabilities Salesforce offers, and the company is very quiet about the actual performance of these capabilities. While the whole purpose of BYOL data federation is to eliminate data copies, as soon as a federated query finishes processing, the resulting data is persisted and stored in Data Cloud, and if you want to use that data across other Salesforce applications, you’ll also need to copy it to each application by publishing your audiences via the platform’s activation capabilities. So, while Salesforce claims that this entire architecture is “zero-copy,” the reality is that many copies actually exist.
This complex architecture is one of the main reasons companies are choosing to implement a Composable CDP and activate data directly from their existing data warehouse rather than implement Salesforce Data Cloud.
Salesforce Data Cloud Features
Salesforce Data Cloud comes with a lot of bells and whistles for data and marketing teams, which can make understanding the platform challenging. With that in mind, here’s a breakdown of the most important concepts and terms related to Salesforce Data Cloud.
| Feature | Overview |
|---|---|
| Data Sources | Native connections to existing Salesforce applications, web and mobile SDKs, or third-party integrations to bring data into Salesforce Data Cloud |
| Data Streams | The actual data that flows into Salesforce Data Cloud |
| Salesforce Data Pipelines | A visual UI and native library of functions and operators to transform your data so it can be mapped back to the Salesforce data model |
| Data Spaces | Logical data partitions that help you organize and govern permissions and user access across brands, regions, or departments |
| Data Lake Objects (DLOs) | A storage container for data ingested into Salesforce Data Cloud |
| Customer 360 Data Model | The logical hierarchy and schema structure that defines the standard objects, fields, metadata, and relationships of your data |
| Data Model Objects (DMOs) | The specific components and building blocks of data within the Salesforce Customer 360 Data Model |
| Data Mapping | The process of mapping data from DLOs to DMOs so you can use it for segmentation and activation across downstream Salesforce applications |
| External Data Lake Objects | A storage container for data federated from external data sources, which acts as a reference to data stored outside of Salesforce Data Cloud |
| Segments | A custom audience, defined in the audience manager, that shares a common set of objects, characteristics, or insights |
| Activation Targets | The supported destinations or endpoints to which you can send published segments and audiences |
| BYOL Data Federation | The process of virtualizing your warehouse data in Salesforce Data Cloud without creating another physical copy of it |
| Zero-Copy | The process of virtualizing your Salesforce data in your data warehouse so you can query it directly in your warehouse without storing a physical copy of it |
| Accelerated Data Federation | A feature that improves query performance when accessing data from external data warehouses, reducing latency and improving data refresh intervals |
How Much Does Salesforce Data Cloud Cost?
On paper, pricing for Salesforce Data Cloud looks relatively straightforward, but under the hood, it’s not quite so simple. The baseline price for the starter version of Data Cloud is $108,000 a year. Pricing within the platform follows a consumption-based model: you purchase credits in Salesforce Data Cloud, which are then consumed as you use various features in the platform. The catch is that it’s extremely easy to burn through credits because nearly every feature consumes them.
Additionally, other core CDP features like audience building, activation, and support for ad destinations are not natively available; they’re only accessible as add-ons. Even worse, data storage starts at a whopping $1,800 per terabyte, which is quite expensive given that traditional data warehouses usually charge around $20 per terabyte of storage.
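To put that storage gap in perspective, here’s a quick back-of-the-envelope comparison in Python using the per-terabyte figures above and a hypothetical 25 TB of customer data; adjust the volume and rates for your own contract.

```python
# Per-terabyte list prices cited above; the 25 TB data volume is hypothetical.
DATA_CLOUD_PER_TB = 1_800
WAREHOUSE_PER_TB = 20
volume_tb = 25

data_cloud_cost = volume_tb * DATA_CLOUD_PER_TB   # 45,000
warehouse_cost = volume_tb * WAREHOUSE_PER_TB     # 500

print(f"Data Cloud storage: ${data_cloud_cost:,}")
print(f"Warehouse storage:  ${warehouse_cost:,}")
print(f"Multiplier:         {DATA_CLOUD_PER_TB / WAREHOUSE_PER_TB:.0f}x")
```

Even before credits enter the picture, storing the same data in Data Cloud runs roughly 90 times the price of keeping it in a typical warehouse.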
Should You Buy Salesforce Data Cloud?
Buying Salesforce Data Cloud might seem like a great idea if you're an existing Salesforce customer. However, the platform simply doesn’t work the way it’s presented, and it has several underlying problems and shortcomings: it’s really expensive, it supports limited destinations outside of the Salesforce ecosystem, transforming and modeling data is very difficult, and all of your data has to fit within the Salesforce Customer 360 Data Model.
Modern companies have increasingly complex data and marketing use cases, and for most organizations, all of their customer data already lives in a data warehouse. It simply doesn’t make sense to maintain two sources of truth and model data in two separate places. You’re never going to convince a data team to do analytics or engineering work out of Salesforce.
With a Composable CDP like Hightouch, you own your data and manage it directly in your infrastructure. That means you can leverage the custom entities that are specific to your business (accounts, subscriptions, products, pets, households, etc.) without conforming your data to the constraints of Salesforce. Fundamentally, you avoid vendor lock-in and unnecessary costs, gain far better flexibility and security, and deliver richer personalization using all of your data.
If you’re interested in learning more about how companies like Petsmart and Warner Music Group are using the Composable CDP to power their marketing use cases, then book a demo with one of our solution engineers today!