Deploying Apache Spark Supervised Machine Learning Models to Production with MLeap

Our architecture to host supervised machine learning models with MLeap benefits Kount’s external and internal customers by:

Since introducing Kount Boost Technology™ in October 2017, Kount has been researching improvements to the Boost supervised machine learning model leveraging distributed tools such as Apache Spark and HDFS. Consequently, we faced the challenging opportunity of architecting a technical solution to bring a Spark-generated machine learning model into our production environment.

Response Time

Using Python3 with Scikit-learn, the following response time statistics and accompanying distribution for generating transactional predictions were gathered from our production environment on 6/4/2018 from 12am until 12pm PT:
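As a rough illustration of how response time statistics like these can be computed from raw measurements, here is a minimal sketch using only the Python3 standard library. The function name `summarize_latencies` and the sample values are hypothetical placeholders, not Kount's code or production numbers.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return summary statistics for a list of response times in milliseconds."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two measurements")
    # quantiles(n=100) returns 99 cut points; index 94 is p95, index 98 is p99.
    # method="inclusive" treats the data as the full population, so the
    # percentiles never fall outside the observed minimum and maximum.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "min": min(latencies_ms),
        "mean": statistics.mean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(latencies_ms),
    }


# Placeholder values for illustration only, not measurements from production.
sample_ms = [10, 20, 30, 40, 50]
summary = summarize_latencies(sample_ms)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Tail percentiles such as p95 and p99 are included because a mean alone can hide the slow outliers that matter most on a transactional critical path.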

In the new world, Apache Spark on top of HDFS builds Boost’s supervised machine learning models, and MLeap enables the production use of those models. We designed several proof-of-concept architectures and load tested them to verify that the model prediction response times would (hopefully!) beat those of our first iteration. The fastest results were promising! In fact, they were AMAZING:

This was a significant turning point in our confidence to deliver supervised machine learning model predictions to our customers faster than we’ve ever done it before.

Using MLeap, the following response time statistics and accompanying distribution for generating transactional predictions were gathered from our production environment on 7/30/18 from 12am until 12pm PT:

Without any critical path issues, the following architecture highlights:

With Python3 and Scikit-learn, we opted to use joblib to pickle the Python3 object and save it to disk, which made the supervised machine learning model portable. Joblib is great, as long as the system doing the exporting and the system doing the importing have the same version of joblib. This requirement added to the complexity of upgrading Python3 module versions.
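One way to make that version coupling explicit is to store the exporting library's version next to the serialized model and check it before loading. The sketch below is an assumption-laden illustration, not the approach described in this post: it uses the standard library's `pickle` as a stand-in for joblib so that it runs without extra dependencies, and the names `export_model` and `import_model` are hypothetical.

```python
import io
import json
import pickle


def export_model(model, library_version, stream):
    """Write a one-line JSON header with the version, then the pickled model."""
    header = {"library_version": library_version}
    stream.write(json.dumps(header).encode("utf-8") + b"\n")
    pickle.dump(model, stream)


def import_model(stream, library_version):
    """Load the model only if it was exported under the same library version."""
    header = json.loads(stream.readline().decode("utf-8"))
    if header["library_version"] != library_version:
        raise RuntimeError(
            "model exported with version %s, but this system runs %s"
            % (header["library_version"], library_version)
        )
    return pickle.load(stream)


if __name__ == "__main__":
    # A plain dict stands in for a trained model object.
    buffer = io.BytesIO()
    export_model({"weights": [0.1, 0.2]}, "1.0", buffer)
    buffer.seek(0)
    print(import_model(buffer, "1.0"))
```

Because the header is plain JSON, the version check happens before any unpickling, so a mismatch surfaces as a clear error instead of an obscure deserialization failure. With joblib itself, the version string would presumably come from `joblib.__version__`.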

I got your attention, right? Well, even though MLeap is relatively new on the open source block (their first release was in September 2016), the resulting product is nothing short of impressive. The only thing I would change about MLeap is maybe make the pronunciation more obvious… is it MLeap? M’Leap? M-Leap? MLe-ap? I guess without it being documented, we’ll never know!

Noah Pritikin
