3 Examples of AI at Work in DataOps

Artificial intelligence (AI) is making all the difference between innovators and laggards in the global marketplace. Yet, implementing a state-of-the-art DataOps operation involves a long-term commitment to putting in place the right people, processes and tools that will deliver results.

In this post, we look at three organizations that are doing cutting-edge work in the field of DataOps, the specific strategies they use and the results they’ve seen as they navigate its uncharted waters.

Uber: End-to-End DataOps

Uber is very vocal about the way it leverages AI across its apps and services globally. The company runs all its machine learning (ML) models on a platform called Michelangelo and uses many other proprietary tools, such as Manifold, Horovod and Piper, to optimize and scale these operations.

Michelangelo helps manage DataOps in a way similar to DevOps by encouraging iterative development of models and democratizing the access to data and tools across teams. On top of this platform, Uber uses other tools to help data teams.

Uber ran its models in TensorFlow and found that model training was taking too long because of the large data sets it was dealing with. Taking a page from Facebook’s data team, the company turned to distributed training, which would enable it to run training in parallel on hundreds of distributed GPUs rather than on a few. Yet, managing all this would be a challenge, which is why Uber built a tool called Horovod. Now under the auspices of The Linux Foundation’s Deep Learning community, Horovod also works with tools such as PySpark, Petastorm and Apache MXNet. Additionally, using mixed precision training, Horovod is able to optimize the use of memory when running models, which makes it possible to train larger models or to process numerous mini-batches of data in parallel.
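To make the idea of distributed training concrete, here is a minimal sketch in plain Python. It is not Horovod's API: the four "workers" are simulated in a single process, and the function names (local_gradient, allreduce_mean, train) and the toy data are assumptions made for illustration. Each worker computes a gradient on its own shard of the data, the gradients are averaged, and every worker applies the same update.

```python
from typing import List, Tuple

Shard = List[Tuple[float, float]]

def local_gradient(weight: float, shard: Shard) -> float:
    """Gradient of mean squared error for y = weight * x on one worker's shard."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values: List[float]) -> float:
    """Stand-in for an allreduce step: every worker receives the average."""
    return sum(values) / len(values)

def train(shards: List[Shard], weight: float = 0.0,
          lr: float = 0.01, steps: int = 200) -> float:
    for _ in range(steps):
        # In a real cluster these gradients would be computed in parallel.
        grads = [local_gradient(weight, shard) for shard in shards]
        weight -= lr * allreduce_mean(grads)
    return weight

# Toy data following y = 3x, split evenly across four simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
w = train(shards)
```

Because the shards are the same size, the averaged gradient matches the gradient over the full data set, so the model learns the same weight it would on one machine while each worker only touches a quarter of the data.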

Manifold is a visualization tool that helps to gauge the performance of various models against each other. Rather than use a simplistic, two-dimensional axis to plot performance data, Manifold groups large quantities of model performance data into subsets that avoid duplicating the important indicators of performance. This reduces noise and helps speed analysis of the data. The lesson: Look beyond the trees to view the entire forest when it comes to metrics.
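The value of looking at subsets rather than one aggregate number can be sketched in a few lines of Python. This is not how Manifold is implemented; the segment names, loss values and the loss_by_segment helper are all hypothetical, chosen only to show how an overall average can hide where each model is strong or weak.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-instance records: (segment, loss of model A, loss of model B).
records = [
    ("short_trip", 0.10, 0.12),
    ("short_trip", 0.14, 0.16),
    ("long_trip", 0.40, 0.22),
    ("long_trip", 0.36, 0.18),
]

def loss_by_segment(rows):
    """Return {segment: (mean loss of A, mean loss of B)}."""
    buckets = defaultdict(lambda: ([], []))
    for segment, loss_a, loss_b in rows:
        buckets[segment][0].append(loss_a)
        buckets[segment][1].append(loss_b)
    return {seg: (mean(a), mean(b)) for seg, (a, b) in buckets.items()}

summary = loss_by_segment(records)
overall = (mean(r[1] for r in records), mean(r[2] for r in records))

for segment, (loss_a, loss_b) in summary.items():
    better = "A" if loss_a < loss_b else "B"
    print(f"{segment}: A={loss_a:.2f} B={loss_b:.2f} -> model {better} is better")
```

In this made-up data, model B wins on the overall average, yet model A is still the better choice for short trips, which is the kind of detail a single aggregate metric would hide.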

By adopting an end-to-end approach to DataOps and leveraging specialized tools at every step of the pipeline, Uber is pioneering the use of AI in its large-scale consumer applications.

Netflix: Offline First, Then Online

Content recommendation is key to the user experience in the Netflix app. Highly relevant content makes the experience personal and drives higher engagement. However, making this happen behind the scenes involves working with large quantities of data and overcoming obstacles every step of the way.

Netflix uses Spark to power content personalization. While the company has existing models in production, it is always testing new models and hypotheses to improve on existing ones. However, every new model needs to be tested against an older model and proven to be better before being implemented. Netflix does this model testing offline rather than online in the live app. The company makes a copy of historical data and performs offline feature generation on this data. The features are stored in a Hive store in AWS S3, and subsequent feature generation and model training are done in Spark. Once a new model is ready, it is run in parallel with a live model and, if it performs better, it is deployed.

Key to the success of this operation is the logging of historical facts from online to offline. This includes Netflix-specific facts such as video, user and computation facts. Once this historical data is stored, a data worker is able to run simulation tests against it using the newly developed models. These simulation tests are a great way to test models without breaking things and save time by failing fast. Once tested and compared with live models, Netflix goes on to establish different SLAs for different microservice apps that these models serve. All this effort results in a highly relevant recommendation engine and a very personalized experience for Netflix users.
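The offline comparison described above can be sketched as a simple replay in Python. This is an illustration under stated assumptions, not Netflix's pipeline: the logged records, the tiny catalog, both stand-in models and the hit_rate metric are hypothetical, and a real setup would run on Spark over far larger logs.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical logged facts: what each user had watched, and what they played next.
history: List[Dict] = [
    {"watched": ["drama_1", "drama_2"], "played_next": "drama_3"},
    {"watched": ["comedy_1"], "played_next": "comedy_2"},
    {"watched": ["drama_2", "drama_1", "comedy_1"], "played_next": "drama_3"},
]
catalog = ["drama_3", "comedy_2", "action_1"]

def live_model(watched: List[str]) -> str:
    """Stand-in for the production model: always recommends one popular title."""
    return "drama_3"

def candidate_model(watched: List[str]) -> str:
    """Stand-in for the new model: recommends from the most-watched genre."""
    genres = Counter(title.split("_")[0] for title in watched)
    top_genre = genres.most_common(1)[0][0]
    return next(t for t in catalog if t.startswith(top_genre))

def hit_rate(model: Callable[[List[str]], str], logs: List[Dict]) -> float:
    """Fraction of logged sessions where the model's pick matches what was played."""
    hits = sum(model(row["watched"]) == row["played_next"] for row in logs)
    return hits / len(logs)

live_score = hit_rate(live_model, history)
candidate_score = hit_rate(candidate_model, history)
promote = candidate_score > live_score  # only a better model moves forward
```

Nothing here touches live traffic: a weak candidate fails fast against the logs, and only one that beats the live model would go on to the parallel run.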

Airbnb: Tapping Into Related Data Stores

Airbnb uses AI to improve the search rankings and relevance of its “Airbnb Experiences” offering. The company started simply, only recording users’ daily searches, clicks and bookings. Soon, it built features on top of this raw data, such as experience duration, price of experience and user reviews, which would help build models. Airbnb used a ranking model to give more weight to experiences that had a better conversion rate, better user reviews and a lower price than to those that simply had a higher booking volume. This improved experience bookings by 13%.
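As a rough illustration of ranking on quality signals rather than raw volume, here is a minimal Python sketch. It is not Airbnb's model: the Experience fields, the sample listings and the 0.5/0.3/0.2 weights are assumptions picked for the example, and a production ranker would learn its weights from data.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Experience:
    name: str
    conversion_rate: float  # bookings divided by views, between 0 and 1
    avg_review: float       # star rating from 1 to 5
    price: float            # price per person
    bookings: int           # raw booking volume (deliberately not scored)

def score(e: Experience, max_price: float) -> float:
    """Hypothetical weighted score: reward conversion, reviews and low price."""
    return (0.5 * e.conversion_rate
            + 0.3 * (e.avg_review / 5.0)
            + 0.2 * (1.0 - e.price / max_price))

experiences: List[Experience] = [
    Experience("Big bus tour", 0.02, 4.1, 120.0, 900),
    Experience("Pasta class", 0.08, 4.9, 60.0, 150),
    Experience("Street art walk", 0.05, 4.6, 30.0, 200),
]

max_price = max(e.price for e in experiences)
ranked = sorted(experiences, key=lambda e: score(e, max_price), reverse=True)
names = [e.name for e in ranked]
```

With these made-up numbers, the bus tour has by far the most bookings but ranks last, because its conversion rate, reviews and price are all weaker than those of the two smaller experiences.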

Going further, Airbnb tapped into the other user data it had (homes booked by Airbnb users) to further personalize the ranking of experiences. This is where things get interesting, as Airbnb is able to show experiences that are more likely to match a user’s interests. This is only possible by looking beyond a single data store for relevant data that may exist elsewhere in the organization. This personalization further improved experience booking by 7.9%.

Similar to Uber, Airbnb uses a combination of offline and online scoring for models. While the historical data is great to get started with and run basic simulation tests, online scoring unlocks additional features such as user location, search filters and login status that wouldn’t otherwise be available offline.


AI is changing everything, and DataOps is no exception. By helping companies to make sense of their data more quickly and with less reliance on human effort, AI drives faster, more consistent results.

The good news for most of us is you don’t need to be a major tech company such as Uber, Netflix or Airbnb to take advantage of AI-driven DataOps. Tools such as Unravel are making AI-powered data insights practical for organizations everywhere, without requiring them to build bespoke DataOps solutions.

This sponsored article was written on behalf of Unravel.

Twain Taylor

Twain is a Fixate IO contributor and began his career at Google, where, among other things, he was involved in technical support for the AdWords team. His work involved reviewing stack traces, resolving issues affecting both customers and the support team, and handling escalations. Later, he built branded social media applications and automation scripts to help startups better manage their marketing operations. Today, as a technology journalist he helps IT magazines and startups change the way teams build and ship applications.
