A community developing a Hugging Face for modeling customer data

Sponsored article

Objectiv is a community-driven, open-source project that aims to do for customer data modeling what Hugging Face did for NLP: https://github.com/objectiv/objectiv-analytics

Customer data modeling is reinvented from scratch by each team

Starting with product analytics, each company or team uses its own custom tracking plan to define events and data structures. As a result, every team climbs the same learning curve: cleaning and preparing data, then building analytics and machine learning models from scratch. The same goes for other key customer data sources such as CRM, payments, and marketing.

Democratize customer data modeling

About a year ago, we brought together a community of 50 companies to develop a Hugging Face-like open-source project for customer data modeling. Its main purpose: to make it possible to build data models on one team's or company's dataset and then run them seamlessly on another.

Step 1: Generalize input data for product analytics

The first step we took was to define an open analytics taxonomy: a detailed specification of which product analytics events to track and how to structure the resulting data. We defined and tested it against the use cases of our initial community of 50 companies to make sure it fits their digital products and data modeling goals.
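
To make the idea of a shared event structure concrete, here is a minimal sketch in Python. It is not the project's actual taxonomy: the class names, field names, and the "PressEvent" type are assumptions chosen for illustration. It only shows what it could mean for every event to carry a fixed type plus structured context about where in the product it happened.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LocationContext:
    """Hypothetical: one UI element on the path to where the event happened."""
    kind: str  # for example "section" or "link"
    id: str


@dataclass(frozen=True)
class TrackedEvent:
    """Hypothetical taxonomy-style event: a fixed type plus structured context."""
    event_type: str
    user_id: str
    time: datetime
    location_stack: tuple = field(default_factory=tuple)

    def location_path(self) -> str:
        """Flatten the location stack into a readable path."""
        return " > ".join(f"{c.kind}:{c.id}" for c in self.location_stack)


# A made-up event: a user pressing the pricing link in the navigation bar.
event = TrackedEvent(
    event_type="PressEvent",
    user_id="user-123",
    time=datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc),
    location_stack=(
        LocationContext("section", "navbar"),
        LocationContext("link", "pricing"),
    ),
)
print(event.location_path())
```

Because every team would describe events in the same shape, downstream code can rely on fields such as the event type and location path without knowing anything about the specific product.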

Step 2: Provide tools to capture validated and clean data

Because the taxonomy normalizes the input data, we were able to develop tools that help engineers implement tracking instrumentation, for instance SDKs with validation at the IDE, runtime, and CI levels.
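
As a rough illustration of what runtime validation against a fixed specification might look like, the sketch below checks plain event dictionaries against a small set of rules. It is not the project's SDK: the allowed event types, the required fields, and the validate_event helper are all hypothetical.

```python
# Hypothetical rules standing in for a real taxonomy specification.
ALLOWED_EVENT_TYPES = {"PressEvent", "VisibleEvent", "InputChangeEvent"}
REQUIRED_FIELDS = {"event_type": str, "user_id": str, "location_stack": list}


def validate_event(event: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in event:
            problems.append(f"missing field: {name}")
        elif not isinstance(event[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    event_type = event.get("event_type")
    if isinstance(event_type, str) and event_type not in ALLOWED_EVENT_TYPES:
        problems.append(f"unknown event type: {event_type}")
    return problems


good = {"event_type": "PressEvent", "user_id": "u1", "location_stack": ["navbar", "pricing"]}
bad = {"event_type": "Clicky", "location_stack": "navbar"}

print(validate_event(good))
print(validate_event(bad))
```

The same kind of check could in principle run in an editor plugin, in the application at runtime, or as a CI step, which is what makes a single shared specification useful for tooling.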

Step 3: Create a place to take models and run them transparently on your data

The initial community has grown to over 300 data team members, and with their input we created the open model hub: a collection of open-source models that range from typical product analytics use cases to predicting user behavior with machine learning. These models run directly on any dataset that follows the open analytics taxonomy. They are powered by a pandas-compatible Python library that runs in any major notebook environment. All models currently work on both PostgreSQL and Google BigQuery, with Amazon Athena next and other data stores to come.
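
The "write once, run on any conforming dataset" idea can be sketched in a few lines. The example below uses plain Python rather than the project's own library, and the field names ("day", "user_id", "event_type") and the daily_unique_users function are assumptions made for illustration. The point is only that a model written against a shared structure needs no changes to run on a different team's data.

```python
from collections import defaultdict
from datetime import date


def daily_unique_users(events):
    """Count distinct users per day; relies only on the shared field names."""
    users_by_day = defaultdict(set)
    for e in events:
        users_by_day[e["day"]].add(e["user_id"])
    return {day: len(users) for day, users in sorted(users_by_day.items())}


# Two unrelated, made-up datasets that follow the same hypothetical structure.
shop_events = [
    {"day": date(2022, 1, 1), "user_id": "a", "event_type": "PressEvent"},
    {"day": date(2022, 1, 1), "user_id": "a", "event_type": "VisibleEvent"},
    {"day": date(2022, 1, 1), "user_id": "b", "event_type": "PressEvent"},
    {"day": date(2022, 1, 2), "user_id": "a", "event_type": "PressEvent"},
]
blog_events = [
    {"day": date(2022, 1, 1), "user_id": "x", "event_type": "PressEvent"},
]

# The same function runs unchanged on both datasets.
shop_result = daily_unique_users(shop_events)
blog_result = daily_unique_users(blog_events)
print(shop_result)
print(blog_result)
```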


Next: more input data sources

With product analytics covered as the entry point, the open analytics taxonomy and open model hub are now being expanded to cover marketing, CRM, payment data, and more.

Get involved

If you would like to try out the project or contribute, check out the repository at https://github.com/objectiv/objectiv-analytics

Sam D. Gomez