Data Layer Privacy

Subsalt’s Generative Database allows you to safely and easily provision privacy-preserving data for machine learning, business intelligence, advanced analytics, and research

Compliance Embedded at the Data Source

Subsalt’s Generative Database guarantees privacy protections and regulatory compliance for existing data science, machine learning, and data sharing workflows and applications

Are you responsible for compliance with privacy regulations at your organization? Contact us to learn more 

Privacy-Preserving Synthetic Data

Synthetic data preserves the statistical properties of a sensitive data set without exposing entities in the data

Delivered as Data Infrastructure
By serving as a data store, Subsalt provides privacy and governance advantages, enabling scalable data provisioning processes


Query-Time Quality Optimization

Data consumers receive data that has been proven to perform similarly to source data for their intended use case

Integration with Data Tools and Applications
Subsalt speaks SQL, so it’s easy to connect to your existing tools, machine learning libraries, and analytics pipelines

What is Synthetic Data?

Synthetic data is computer-generated data that mirrors the statistical properties and column structure of an underlying data set. It looks and feels like real data. It can be used to generate the same insights as your sensitive data. And because it's new, computer-generated data, it guarantees privacy.
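
To make "mirrors the statistical properties" concrete, here is a minimal sketch in Python. It is a toy illustration, not Subsalt's method: it assumes a single numeric column, summarizes it with a mean and standard deviation, and samples new values from a normal distribution with those parameters. The column values and the function names `fit_column` and `sample_column` are hypothetical. A real system would model many columns jointly and add explicit privacy protections.

```python
import random
import statistics


def fit_column(values):
    """Summarize a numeric column by its mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def sample_column(mean, stdev, n, seed=0):
    """Draw n new values from a normal distribution with the fitted parameters."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Hypothetical "sensitive" column (made-up values for illustration only).
real_ages = [34, 45, 29, 52, 41, 38, 47, 33, 50, 36]

mean, stdev = fit_column(real_ages)
synthetic_ages = sample_column(mean, stdev, n=1000)

# The synthetic column contains no original rows, but its summary
# statistics should land close to those of the source column.
print("source mean/stdev:   ", round(mean, 2), round(stdev, 2))
print("synthetic mean/stdev:", round(statistics.mean(synthetic_ages), 2),
      round(statistics.stdev(synthetic_ages), 2))
```

The synthetic values are freshly sampled rather than copied, which is the intuition behind the description above. Matching a mean and standard deviation is only the simplest possible notion of "statistical properties".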


Why Synthetic Data?

Synthetic data preserves statistical properties of source data and preserves privacy for entities in the data set. This ensures compliance with privacy laws and unlocks risk-free data analysis and data sharing without restrictions on data use, destructive redactions, or expensive legal and compliance processes.


Why Not Synthetic Data?

Synthetic data isn't a viable replacement for real data in every case. For processes that require user or patient identity to be preserved, like support or clinical operations, real data is still required. But for data science and analytics, synthetic data is the best way to provision safe, private data. 

Empower Your Data Teams with Fast, Safe Access to Sensitive Data 

Eliminate the long, expensive legal and compliance reviews and the custom data engineering required to provision sensitive data. By making compliant data available at the infrastructure layer, Subsalt enables fast, consistent, scalable data provisioning.

Subsalt provides access to synthetic data proven to perform similarly to the source data for each analysis. This ensures that Subsalt's infrastructure can serve as shared data infrastructure for many data consumers.

Subsalt can interact with your existing data stack, from BI tools and ML libraries to Jupyter Notebooks and governance tools. By delivering row-level data access through standard interfaces, Subsalt is designed to introduce a compliant data source without requiring changes to downstream tools and systems.

Synthetic data produced by Subsalt's system is provably privacy-preserving. Because this protection is applied at the data source, downstream analysis can be performed without risking violations of privacy regulations or disclosure of private information in the event of a breach.

Are you a data leader with data access challenges? Get in touch

How It Works

Connect to Your Data Source
Subsalt connects to your existing data store to access the real data that it analyzes to create hosted synthetic data assets.
Define Your Data Set
Data owners define the features in their synthetic data set and the table structure using SQL.
Subsalt Builds Synthetic Data Models
Subsalt accesses the source data and builds the statistical models that will be used to generate synthetic data. No Subsalt employee can ever access source data during processing, and all real data is discarded after the models are built.
Provision Access to Data Consumers
The data owner controls access to Subsalt's data store. Through a simple interface, they can provision access to any data consumer.
Consumers Query Subsalt Like Any Other Database
Once a data owner grants access, data consumers receive authorization credentials that allow them to query a data table that has been proven to perform well for their intended use case.
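
The final step, querying like any other database, might look something like the sketch below from a data consumer's side. Everything specific here is an assumption made for illustration: the page does not name a driver, connection string, or table, so an in-memory SQLite database stands in for the synthetic data store, and the `synthetic_patients` table, its columns, and its rows are hypothetical. The point is only that the consumer writes ordinary SQL and receives row-level results.

```python
import sqlite3

# Stand-in for the synthetic data store (assumption: the real connection
# details and credentials would come from the data owner).
conn = sqlite3.connect(":memory:")

# Hypothetical synthetic table, populated here only so the sketch runs.
conn.execute(
    "CREATE TABLE synthetic_patients (age INTEGER, region TEXT, readmitted INTEGER)"
)
conn.executemany(
    "INSERT INTO synthetic_patients VALUES (?, ?, ?)",
    [
        (34, "north", 0),
        (61, "north", 1),
        (72, "north", 1),
        (47, "south", 0),
        (55, "south", 1),
        (29, "south", 0),
    ],
)

# The consumer's analysis is plain SQL, the same as against any other database.
query = """
    SELECT region, COUNT(*) AS n, AVG(readmitted) AS readmit_rate
    FROM synthetic_patients
    GROUP BY region
    ORDER BY region
"""
rows = conn.execute(query).fetchall()

for region, n, readmit_rate in rows:
    print(region, n, round(readmit_rate, 3))

conn.close()
```

Because the interface is standard SQL, the same query text could be issued from a BI tool, a notebook, or an ML pipeline without changing the analysis itself.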