In principle, the cloud is becoming a commodity. Google, Amazon and other companies are trying to commoditize almost everything, including the instruments for data gathering, data storage and data transformation. In the near term, we should expect that platform engineering, which today is overly complex, will become simpler.
With data engineering, however, the prospects for such simplification in the near future are not that clear. Currently there is no single simple solution even for ETL (extract, transform, load). Services such as AWS Data Pipeline or AWS Glue (which hasn’t been fully released) represent the very humble beginnings of what people are building right now, brick by brick. Data engineers, architects and other professionals are building these tools to solve real-life cases.
From the data science point of view, new machine-learning-as-a-service solutions are popping up like mushrooms after a spring rain. There is a multitude of them on the market: BigML, DataRobot, Azure ML, Amazon ML, IBM Watson, Google Prediction and so on. But what is machine learning as a service (MLaaS)?
Let’s look at IBM Watson Analytics as an example:
MLaaS is a set of preset wizards for creating machine learning models and using those models through an API. For example, suppose we need to analyze a text and extract facts and data from it. To do that, we start the service in a couple of clicks, submit the original text and get back a marked-up, analyzed copy. In this instance, the service is commoditizing NLP (natural language processing). The same goes for VTT (voice-to-text): We submit an audio file and get the text version of the talk back.
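The text-analysis round trip described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the endpoint URL, the `features` field, the bearer-token header and the `entities` key in the response are all hypothetical and do not reproduce any particular vendor's API. Each real service defines its own request and response format.

```python
import json
from typing import Callable, Dict, List
from urllib import request


def build_request(text: str, endpoint: str, api_key: str) -> request.Request:
    """Package the original text as a JSON POST request.

    The payload shape and the auth header are assumptions for illustration.
    """
    payload = json.dumps({"text": text, "features": ["entities"]}).encode("utf-8")
    return request.Request(
        endpoint,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(req: request.Request) -> dict:
    """Send the request over HTTP and decode the JSON reply."""
    with request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def extract_entities(
    text: str,
    endpoint: str,
    api_key: str,
    transport: Callable[[request.Request], dict] = send,
) -> List[Dict]:
    """Submit text to the service and return the entities it marked up.

    `transport` is injectable so the flow can be tried without a network call.
    """
    response = transport(build_request(text, endpoint, api_key))
    return response.get("entities", [])
```

With a real service you would pass its documented endpoint and key and adapt the payload to its schema. For a dry run, pass a stand-in `transport` function that returns a canned dictionary instead of calling the network.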
There are other uses of machine learning technology:
Machine learning works closely with data. But it is one thing to have data, and another to have actionable data with which you can help users. As seen from the chart above, one of the key trends is the commoditization of data services.
These basic services do what they are supposed to, and this is great. And you don’t need to employ a data scientist to do some of these tasks. But if you are a company like Grammarly, whose know-how is in actual text recognition, provision of recommendations and text analytics, then ready-made services are insufficient. Grammarly, and companies like it, have to take a hands-on approach that will be very expensive. This is their core business and using third-party services is not viable.
So how would you go about using ML cloud services?
Even with a large number of “as a service” solutions available, the main challenge is navigating them and understanding “what is better,” “what is best” and “how it all works.” The issue is that such services are constantly changing. The best choice for you may be to engage a consultancy that specializes in tool selection. It’s imperative to understand what is available now and what tools would best suit your business. This is the most basic but essential help you may need.
Secondly, integrating your operations with these services requires dedicated engineering effort. Indeed, most data science tasks right now are solved through these cloud “as a service” solutions. But a consulting company with a strong engineering background can help you make better use of such ready-to-use solutions. They offer value to the business at a faster pace. There is no need to reinvent the wheel, especially when your consultant knows what to use and where.
If you go to a data science firm where the majority of employees are data scientists, they will create unique, ingenious custom solutions for you, do great research and so on. This is what they are good at. This is their bread and butter.
An end-to-end consulting company may be a better choice for your company, as its goal is to bring value to your business. Consultants will tell you if MLaaS solutions are appropriate for your machine learning needs, and will have a deep understanding of how these services work. A consultancy should also design and build a proper architecture, and provide custom development work that is specific to your services, so that your business can scale.
Finally, evolution is key if you plan to grow your business. Start with the ready-made solutions, then in time switch to in-house solutions built with the help of a group of data scientists.
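One way to keep that later switch cheap is to hide the model behind a small interface from day one. The sketch below is a hypothetical illustration, not a prescribed design: `TextClassifier`, `HostedClassifier`, `InHouseClassifier` and `find_complaints` are invented names, the hosted call is represented by a plain function you would supply, and the in-house "model" is a trivial keyword check standing in for whatever your data scientists eventually build.

```python
from typing import Callable, Iterable, List, Protocol


class TextClassifier(Protocol):
    """The only thing the rest of the application is allowed to depend on."""

    def classify(self, text: str) -> str: ...


class HostedClassifier:
    """Wraps a ready-made MLaaS model; `call_service` is a hypothetical API call."""

    def __init__(self, call_service: Callable[[str], str]) -> None:
        self._call_service = call_service

    def classify(self, text: str) -> str:
        return self._call_service(text)


class InHouseClassifier:
    """Placeholder for a custom model; here just a keyword lookup."""

    def __init__(self, complaint_words: Iterable[str]) -> None:
        self._complaint_words = {w.lower() for w in complaint_words}

    def classify(self, text: str) -> str:
        # Normalize words by stripping basic punctuation and lowercasing.
        words = {w.strip(".,!?").lower() for w in text.split()}
        return "complaint" if words & self._complaint_words else "other"


def find_complaints(classifier: TextClassifier, messages: List[str]) -> List[str]:
    """Business logic that works with either implementation unchanged."""
    return [m for m in messages if classifier.classify(m) == "complaint"]
```

Because `find_complaints` only knows about the `classify` method, moving from the hosted service to an in-house model means swapping the object passed in, not rewriting the code that uses it.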
About the Author / Stanislav Ivashchenko
Stanislav Ivashchenko is the Senior DevOps Consultant at SQUADEX. A certified AWS solutions architect, he is experienced in building, delivering, supporting and optimizing a broad range of software applications. From bare metal to the cloud, from monolith architecture to microservices, he helps improve software delivery processes in startups, mid-sized companies and large enterprises. He has solid skills in orchestrating various DevOps and infrastructure related tools, services and management practices. Connect with him on LinkedIn.