SEATTLE, Dec. 03, 2020 (GLOBE NEWSWIRE) -- Today at the Apache TVM and Deep Learning Compilation Conference, OctoML, the MLOps automation company for superior model performance, portability and productivity, announced early access to Octomizer. Octomizer brings the power and potential of Apache TVM, an open source deep learning compiler project that is becoming a de facto industry standard, to machine learning engineers challenged by model deployment timelines, inferencing and throughput performance issues, or high cloud inferencing costs.
Industry analysts estimate that machine learning model costs will double from $50 billion in 2020 to more than $100 billion by 2024. Many machine learning models put into production today cost hundreds of thousands to millions of dollars to train, and training costs represent only a fraction of the ongoing inferencing costs that businesses take on to provide cutting-edge capabilities to their end users.
“In our early engagements with some of the world’s leading tech companies, they’ve been excited about our ability to provide unparalleled model performance improvement,” said Luis Ceze, OctoML co-founder and CEO. “Now we’re excited to open the Octomizer up for early access to a select set of customers and partners with similar model performance, inferencing cost savings or edge deployment needs.”
OctoML has demonstrated the potential of the Octomizer with early customer engagements across model architectures and hardware targets. OctoML’s early partners include Computer Vision (CV) and Natural Language Processing (NLP) machine learning teams focused on improving model performance on various targets such as NVIDIA’s V100, K80, and T4 GPU platforms, Intel’s Cascade Lake, Skylake, and Broadwell x86 CPUs, and AMD’s EPYC Rome x86 CPUs. Model performance improvements were at an order-of-magnitude level. For example, a computer vision team worked with OctoML to decrease model latency from 95 milliseconds to 10 milliseconds, unlocking higher throughput and enabling new product feature development.
Accessible through both a SaaS platform and an API, the Octomizer accepts serialized models, enables users to select specific hardware targets, and losslessly optimizes and packages models for the selected hardware. By making use of TVM’s state-of-the-art technical performance capabilities, the Octomizer can deliver model performance improvements of up to 10 times, enabling deep learning teams to improve model performance, cut inferencing costs, and reduce time and effort for model deployment.
The Octomizer currently supports all cloud-based CPU and GPU hardware targets, as well as ARM A-class hardware targets, with additional hardware targets identified for early 2021.
As part of its enterprise offerings, OctoML also provides customer-specific hardware target onboarding, which enables internal performance testing and benchmarking and vendor-specific model optimization.
About Apache TVM
Apache TVM is an open source deep learning compiler and runtime that optimizes the performance of machine learning models across a multitude of processor types, including CPUs, GPUs, accelerators and mobile/edge chips. It uses machine learning to optimize and compile models for deep learning applications, closing the gap between productivity-focused deep learning frameworks and performance-oriented hardware backends. It is used by some of the world’s biggest companies, including Amazon, AMD, ARM, Facebook, Intel, Microsoft and Qualcomm.
About the Apache TVM and Deep Learning Compilation Conference
The 3rd Annual Apache TVM and Deep Learning Compilation Conference covers the state of the art in deep learning compilation and optimization, along with recent advances in frameworks, compilers, systems and architecture support, security, training and hardware acceleration. Speakers include technology leaders from Alibaba, Amazon, AMD, ARM, Bosch, Microsoft, NTT, OctoML, Qualcomm, Sima.ai and Xilinx, as well as researchers from Beihang University, Carnegie Mellon University, Cornell, National Tsing-Hua University (Taiwan), UCLA, University of California at Berkeley, University of Toronto and University of Washington. The free virtual conference is taking place Dec. 2-4: https://tvmconf.org.
About OctoML
OctoML applies cutting-edge machine learning-based automation to make it easier and faster for machine learning teams to put high-performance machine learning models into production on any hardware. OctoML, founded by the creators of the Apache TVM machine learning compiler project, offers seamless optimization and deployment of machine learning models as a managed service. For more information, visit https://octoml.ai or follow @octoml.