[Paper Review] Machine Learning Operations: Overview, Definition, and Architecture

Abstract

  • The ultimate goal of every industrial machine learning (ML) project is to develop an ML product and bring it into production. However, automating and operationalizing ML products is highly challenging.
  • This paper provides a guide for ML researchers and practitioners who want to automate and operate ML products with a designated set of technologies.

Introduction

  • A large number of ML products fail because of the work needed to run them in production.
  • From a research point of view this is not surprising, because most ML work has concentrated on building good ML models rather than on (a) building production-ready ML products and (b) providing the necessary coordination of the resulting, often complex, ML system components and infrastructure, including the roles required to automate and operate an ML system in a real-world setting.
  • For instance, in most places data scientists still manage the ML workflows to a great extent, which causes many issues when the respective ML solutions are operated.
  • While researchers have shed some light on specific aspects of MLOps, three things are still missing: a holistic conceptualization, generalization, and a clarification of ML system design.

Foundations of DevOps

  • DevOps is more than a pure methodology; it is a paradigm that addresses social and technical issues in organizations engaged in software development.
  • Its goal is to eliminate the gap between development and operations, and it emphasizes collaboration, communication, and knowledge sharing.
  • It ensures automation through continuous integration and continuous delivery (CI/CD). It is also designed for continuous testing, quality assurance, continuous monitoring, logging, and feedback loops.
  • Cloud platforms are equipped with ready-to-use DevOps tooling that is designed for cloud use.
  • Empirical results show that DevOps ensures better software quality.

Methodology

Literature Review

  • For the literature review, the authors followed the methods of Webster and Watson and of Kitchenham.
  • After an initial search they settled on the query (("DevOps" OR "CICD" OR "Continuous Integration" OR "Continuous Delivery" OR "Continuous Deployment") AND "Machine Learning") OR "MLOps" OR "CD4ML". They ran this query against databases such as Google Scholar and Web of Science.
  • The search, carried out in May 2021, returned 1,864 articles. Of those, 194 papers were screened in detail, and 27 articles were finally selected as relevant.
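
The boolean structure of the search query can be made explicit with a small predicate. This is only a sketch: the function name `matches_query` is hypothetical, and it uses plain case-insensitive substring matching, which is an assumption and not how the paper or the databases describe their matching.

```python
# Terms from the first OR group of the query in the review.
DEVOPS_TERMS = (
    "devops",
    "cicd",
    "continuous integration",
    "continuous delivery",
    "continuous deployment",
)


def matches_query(text: str) -> bool:
    """Return True if `text` satisfies the search query.

    Query: ((DevOps-like term) AND "Machine Learning") OR "MLOps" OR "CD4ML".
    Matching is a simple lowercase substring test (an assumption).
    """
    lowered = text.lower()
    devops_and_ml = (
        any(term in lowered for term in DEVOPS_TERMS)
        and "machine learning" in lowered
    )
    return devops_and_ml or "mlops" in lowered or "cd4ml" in lowered


if __name__ == "__main__":
    for title in (
        "Continuous Delivery for Machine Learning",
        "DevOps in practice",
        "An MLOps survey",
    ):
        print(title, "->", matches_query(title))
```

A title that mentions only a DevOps term does not match; it needs "Machine Learning" as well, unless it contains "MLOps" or "CD4ML" directly.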

Tool Review

  • The authors also reviewed various open-source tools, frameworks, and commercial cloud ML services to gain technical domain knowledge.

Interview Study

  • To gain insights from various perspectives, the authors chose interview partners from different organizations and industries, different countries and nationalities, and different genders.
  • In total, they conducted eight interviews.

Results

Principles

  • A principle is a guide to how things should be realized in MLOps; put simply, a best practice to follow. The paper identifies nine principles, referred to as P1 to P9 in the component list:
  • P1: CI/CD automation
  • P2: Workflow orchestration
  • P3: Reproducibility
  • P4: Versioning
  • P5: Collaboration
  • P6: Continuous ML training and evaluation
  • P7: ML metadata tracking/logging
  • P8: Continuous monitoring
  • P9: Feedback loops
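
To make reproducibility, versioning, and ML metadata tracking a little more concrete, here is a minimal sketch of a run record that fingerprints the parameters and the training data. The names `fingerprint` and `make_run_record` and the record layout are hypothetical and not taken from the paper; real metadata stores track much more.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Stable SHA-256 hash of any JSON-serializable object."""
    # sort_keys makes the hash independent of dict key order.
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_run_record(params: dict, data_rows: list, metrics: dict,
                    code_version: str) -> dict:
    """Collect what is needed to identify and compare a training run."""
    return {
        "code_version": code_version,          # versioning of the code
        "params": params,                      # hyperparameters used
        "params_hash": fingerprint(params),    # quick equality check
        "data_hash": fingerprint(data_rows),   # versioning of the data
        "metrics": metrics,                    # tracked results
    }


if __name__ == "__main__":
    record = make_run_record(
        params={"lr": 0.01, "epochs": 5},
        data_rows=[[1.0, 2.0, 0], [3.0, 4.0, 1]],
        metrics={"accuracy": 0.9},  # illustrative value, not a real result
        code_version="abc123",      # hypothetical commit id
    )
    print(json.dumps(record, indent=2))
```

Two runs with the same `params_hash`, `data_hash`, and `code_version` are candidates for producing the same model, which is the basic idea behind a reproducibility check.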

Components

  • After identifying the principles that MLOps needs to incorporate, the paper describes the concrete components that implement them in an ML system design.
  • There are nine components, each annotated with the principles it supports (P1 to P9, in the order listed under Principles):
  • CI/CD Component (P1, P6, P9)
  • Source Code Repository (P4, P5)
  • Workflow Orchestration Component (P2, P3, P6)
  • Feature Store System (P3, P4)
  • Model Training Infrastructure (P6)
  • Model Registry (P3, P4)
  • ML Metadata Stores (P4, P7)
  • Model Serving Component (P1)
  • Monitoring Component (P8, P9)
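
The component-to-principle mapping above can be written down as data, which makes it easy to ask which components support a given principle. This is a small sketch: the P1 to P9 labels are assumed to follow the order of the principle list, and the helper names `components_for` and `uncovered_principles` are hypothetical.

```python
# Principles, numbered in the order they are listed in the review.
PRINCIPLES = {
    "P1": "CI/CD automation",
    "P2": "Workflow orchestration",
    "P3": "Reproducibility",
    "P4": "Versioning",
    "P5": "Collaboration",
    "P6": "Continuous ML training and evaluation",
    "P7": "ML metadata tracking/logging",
    "P8": "Continuous monitoring",
    "P9": "Feedback loops",
}

# Components and the principles each one supports, as listed above.
COMPONENTS = {
    "CI/CD component": ("P1", "P6", "P9"),
    "Source code repository": ("P4", "P5"),
    "Workflow orchestration component": ("P2", "P3", "P6"),
    "Feature store system": ("P3", "P4"),
    "Model training infrastructure": ("P6",),
    "Model registry": ("P3", "P4"),
    "ML metadata stores": ("P4", "P7"),
    "Model serving component": ("P1",),
    "Monitoring component": ("P8", "P9"),
}


def components_for(principle_id: str) -> list:
    """Components that support the given principle, in listing order."""
    return [name for name, ids in COMPONENTS.items() if principle_id in ids]


def uncovered_principles() -> list:
    """Principles that no component supports."""
    covered = {pid for ids in COMPONENTS.values() for pid in ids}
    return sorted(set(PRINCIPLES) - covered)


if __name__ == "__main__":
    for pid, name in PRINCIPLES.items():
        print(f"{pid} {name}: {', '.join(components_for(pid))}")
    print("Uncovered:", uncovered_principles())
```

Reading the mapping in this direction shows, for example, that versioning (P4) is spread across four components, while collaboration (P5) is attached only to the source code repository.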

Roles

  • Business Stakeholder
  • Solution Architect
  • Data Scientist
  • Data Engineer
  • Software Engineer
  • DevOps Engineer
  • ML Engineer / MLOps Engineer

Architecture and Workflow

  • Based on the principles, components, and roles, the authors designed an end-to-end architecture and workflow for ML researchers and practitioners.
  • The artifact is designed to be technology-agnostic, so ML researchers and practitioners can choose the technologies and frameworks that best fit their needs.
  • The workflow consists of the following stages:
  1. MLOps project initiation
  2. Requirements for feature engineering pipeline
  3. Feature engineering pipeline
  4. Experimentation
  5. Automated ML Workflow Pipeline
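
An automated ML workflow pipeline is typically run by a workflow orchestrator that executes tasks in dependency order. The sketch below shows that idea with the standard-library `graphlib` module (Python 3.9+). The task names in `PIPELINE` and the helpers `execution_order` and `run` are illustrative assumptions, not steps quoted from this review, and a real orchestrator adds scheduling, retries, and artifact passing.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
# Task names are hypothetical examples of pipeline steps.
PIPELINE = {
    "data_extraction": set(),
    "data_preparation_and_validation": {"data_extraction"},
    "model_training": {"data_preparation_and_validation"},
    "model_evaluation": {"model_training"},
    "model_export": {"model_evaluation"},
    "push_to_model_registry": {"model_export"},
}


def execution_order(dag: dict) -> list:
    """Return the tasks in an order that respects all dependencies."""
    return list(TopologicalSorter(dag).static_order())


def run(dag: dict, tasks: dict) -> list:
    """Call each task's function in dependency order; return the run log."""
    log = []
    for name in execution_order(dag):
        tasks[name]()  # a real orchestrator would also handle failures
        log.append(name)
    return log


if __name__ == "__main__":
    # Placeholder task bodies that only announce themselves.
    tasks = {name: (lambda n=name: print("running", n)) for name in PIPELINE}
    run(PIPELINE, tasks)
```

Because the dependencies are declared as data, the same definition stays technology-agnostic: it could be translated to whichever orchestration tool a team chooses.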

Summary


My skills include Data Analysis, Data Visualization, Machine learning, and Deep Learning. I have developed a strong acumen for problem-solving, and I enjoy ML.

Mohit Mishra