Deploying analytics with the portable format for analytics (PFA)

Jim Pivarski, Collin Bennett, Robert L. Grossman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

26 Scopus citations

Abstract

We introduce a new language for deploying analytic models into products, services and operational systems called the Portable Format for Analytics (PFA). PFA is an example of what is sometimes called a model interchange format, a language for describing analytic models that is independent of specific tools, applications or systems. Model interchange formats allow one application (the model producer) to export models and another application (the model consumer or scoring engine) to import models. The core idea behind PFA is to support the safe execution of statistical functions, mathematical functions, and machine learning algorithms and their compositions within a safe execution environment. With this approach, the common analytic models used in data science can be implemented, as well as the data transformations and data aggregations required for pre- and post-processing data. PFA-compliant scoring engines can be extended by adding new user-defined functions described in PFA. We describe the design of PFA. A Data Mining Group (DMG) Working Group is developing the PFA standard. The current version is 0.8.1 and contains many of the commonly used statistical and machine learning models, including regression, clustering, support vector machines, neural networks, etc. We also describe two implementations of Hadrian, one in Scala and one in Python. We discuss four case studies that use PFA and Hadrian to specify analytic models, including two that are deployed in operations at client sites.
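
The producer/consumer split described in the abstract can be illustrated with a minimal sketch. It assumes a PFA-style JSON document with top-level "input", "output" and "action" fields and function calls written as single-key objects. The evaluator, the SAFE_FUNCTIONS table and the score function are hypothetical names for a toy interpreter written for this illustration; they are not the Hadrian API and cover only a tiny fraction of what a real scoring engine supports.

```python
import json

# Toy "producer": serialize a model as a PFA-style JSON document.
# This model computes 2 * input + 1 for each incoming number.
model_json = json.dumps({
    "input": "double",
    "output": "double",
    "action": [{"+": [{"*": ["input", 2.0]}, 1.0]}],
})

# Whitelist of permitted functions (an assumption of this sketch):
# the document can only call what appears in this table.
SAFE_FUNCTIONS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}

def evaluate(expr, datum):
    """Evaluate one expression against the current input datum."""
    if isinstance(expr, (int, float)):
        return float(expr)
    if expr == "input":
        return datum
    if isinstance(expr, dict) and len(expr) == 1:
        (name, args), = expr.items()
        if name not in SAFE_FUNCTIONS:
            raise ValueError(f"function not permitted: {name}")
        return SAFE_FUNCTIONS[name](*(evaluate(a, datum) for a in args))
    raise ValueError(f"unsupported expression: {expr!r}")

def score(document, datum):
    """Toy "consumer": load the document and run its action on one datum."""
    model = json.loads(document)
    result = None
    for expr in model["action"]:
        result = evaluate(expr, datum)
    return result

print(score(model_json, 3.0))
```

Because the model is data rather than code, the consumer decides which functions may run; a document that names anything outside the whitelist is rejected instead of executed.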

Original language: English (US)
Title of host publication: KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 579-588
Number of pages: 10
ISBN (Electronic): 9781450342322
State: Published - Aug 13 2016
Externally published: Yes
Event: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016 - San Francisco, United States
Duration: Aug 13 2016 → Aug 17 2016

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 13-17-August-2016

Other

Other: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016
Country/Territory: United States
City: San Francisco
Period: 8/13/16 → 8/17/16

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems

Keywords

  • Deploying analytics
  • Model producers
  • PFA
  • PMML
  • Portable Format for Analytics
  • Scoring engines
