Shaman provides a complete machine learning framework for knowledge worker automation and robotics. Shaman simplifies the path from a machine learning opportunity to a robust and integrated production system. It should be particularly useful to teams looking to integrate machine learning into their processes for the first time.

Shaman is principally:

  • A composable selection of open source components organized into a microservices framework (see below)
  • A user interface generator for labeling data, training, validation and production
  • Data generators for many types of data (documents, web pages, images, ERP data, CRM data, etc.)
  • Data extractors for many types of documents and web content
  • Connectors to XML and JSON web services, email and machine learning services

Out of the box Shaman can be deployed to:

  • categorize emails, PDF, Word and Excel documents, web pages, images and data from web services
  • extract data from emails, documents, web pages, images, video and sound
  • extract sentiment from text and emotions from images
  • explore and solve problems in a simulated environment (3D, 2D, textual or numerical)
  • answer textual questions using information in your ERP or CRM
  • provide ratings and recommendations
  • connect to cloud services such as voice recognition and generation
  • compose multi-stage processes (e.g. image classification -> image quality scoring -> region detection -> OCR)
  • run on Linux or Windows servers; Raspberry Pi 3 or NVIDIA Jetson TX2; macOS, Windows or Linux workstations
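
To make the multi-stage idea concrete, here is a minimal Python sketch of how stages such as image classification, quality scoring, region detection and OCR might be chained. It is only an illustration: the names `compose`, `classify_image`, `score_quality`, `detect_regions` and `run_ocr` are hypothetical and are not part of Shaman's API, and each stage returns placeholder values rather than calling a real model.

```python
from typing import Any, Callable, Dict

# A stage takes a shared context dict and returns an updated copy.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


def compose(*stages: Stage) -> Stage:
    """Chain stages left to right; stop early if a stage sets 'stop'."""
    def pipeline(context: Dict[str, Any]) -> Dict[str, Any]:
        for stage in stages:
            context = stage(context)
            if context.get("stop"):
                break
        return context
    return pipeline


# Hypothetical placeholder stages: real ones would call trained models.
def classify_image(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {**ctx, "category": "invoice"}


def score_quality(ctx: Dict[str, Any]) -> Dict[str, Any]:
    quality = 0.9 if ctx["image"] else 0.0
    # Low-quality images are not passed on to the later stages.
    return {**ctx, "quality": quality, "stop": quality < 0.5}


def detect_regions(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {**ctx, "regions": [(0, 0, 100, 20)]}


def run_ocr(ctx: Dict[str, Any]) -> Dict[str, Any]:
    texts = ["<text of region %d>" % i for i, _ in enumerate(ctx["regions"])]
    return {**ctx, "text": texts}


process = compose(classify_image, score_quality, detect_regions, run_ocr)
result = process({"image": b"fake image bytes"})
```

Treating every stage as a function over one shared dictionary keeps stages interchangeable, and the early stop lets a quality gate skip the more expensive steps.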

The first public release of Shaman will be available in summer 2018 under an open source licence (probably Apache or GPLv3).

Components