Add MLflow.

Closed. Administrator requested to merge github/fork/jmrr/master into master on Oct 06, 2019.

Created by: jmrr

What is this Python project?

MLflow is an open source platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models.

MLflow is the most comprehensive, platform-agnostic project, aiming to encompass three main components of the ML lifecycle on a single platform:

  • MLflow Tracking: An API to log parameters, code, and results in machine learning experiments and compare them using an interactive UI (see the sketch after this list).

  • MLflow Projects: A code packaging format for reproducible runs using Conda and Docker, so you can share your ML code with others.

  • MLflow Models: A model packaging format and tools that let you easily deploy the same model (from any ML library) to batch and real-time scoring on platforms such as Docker, Apache Spark, Azure ML and AWS SageMaker.
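
To make the Tracking and Models components concrete, here is a minimal sketch of the Python API, assuming mlflow and scikit-learn are installed; the run name, model choice, and metric below are illustrative, not part of this MR:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Toy data and model; any ML library would do here.
X, y = make_regression(n_samples=200, n_features=5, random_state=0)

with mlflow.start_run(run_name="demo"):  # one tracked experiment run
    alpha = 0.5
    model = Ridge(alpha=alpha).fit(X, y)

    # MLflow Tracking: log parameters and metrics for later comparison.
    mlflow.log_param("alpha", alpha)
    mlflow.log_metric("mse", mean_squared_error(y, model.predict(X)))

    # MLflow Models: persist the model in MLflow's packaging format.
    mlflow.sklearn.log_model(model, "model")
```

Runs logged this way can then be browsed and compared in the interactive UI (started locally with mlflow ui). A packaged project (MLflow Projects) is launched similarly, via mlflow run <uri> or mlflow.projects.run(), which resolves the declared Conda or Docker environment before executing.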

What's the difference between this Python project and similar ones?

  • MLOps is still a domain in its early stages, but some tools based on the Kubernetes containerised ecosystem already exist:
    • Kubeflow
    • Pachyderm
    • Polyaxon

The fact that they're based on Kubernetes appears to be somewhat of a barrier for small-scale Data Science teams, whilst with MLflow an individual contributor can easily set up a single tracking server for their own experiments. They also tend to be more Deep Learning oriented. An advantage of Pachyderm is that it provides data reproducibility (on top of the code + model reproducibility provided by MLflow).
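
As a rough sketch of how lightweight that single-contributor setup can be (the SQLite backend, port, and experiment name here are assumptions for illustration, not prescribed by MLflow):

```python
# Start a self-hosted tracking server first, e.g.:
#   mlflow server --backend-store-uri sqlite:///mlflow.db \
#                 --default-artifact-root ./artifacts --port 5000
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")  # point the client at the server
mlflow.set_experiment("my-experiments")           # hypothetical experiment name

with mlflow.start_run():
    mlflow.log_metric("accuracy", 0.93)  # placeholder metric value
```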

  • Sacred provides experimentation logging, but doesn't provide model packaging and sharing, or the possibility of creating reproducible projects with your ML code for other people to use. Also, you'd need a frontend (see next entry) to visualise and track your experiments, which MLflow's tracking server already provides.

  • Omniboard would only provide the frontend.

Some other nice tools exist, but they're library-specific, e.g. TensorBoard for tracking a specific framework's runs, and TFX for TensorFlow in the domain of model deployment.


Anyone who agrees with this pull request can vote for it by adding a 👍, and the maintainer will usually merge it when votes reach 20.
