Scala for Machine Learning (Second Edition)

Summary

In this chapter, we established the framework for the different data processing units that will be introduced in this book. There is a very good reason why the topics of model validation and overfitting are treated early on in this book: there is no point in building models and selecting algorithms if we do not have a methodology to evaluate their relative merits.

You were introduced to the following topics:

  • The concept of monadic transformation for implicit and explicit models
  • The versatility and cleanness of the cake pattern and mixin composition in Scala as an effective scaffolding tool for data processing
  • A robust methodology to validate machine learning models
  • The challenge in fitting models to both training and real-world data
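
As a reminder of how these ideas fit together, the following is a minimal sketch and not the framework developed in the chapter. It assumes that each processing step returns a `scala.util.Try`, so that steps compose monadically in a for-comprehension and the first failure short-circuits the pipeline. The names `ValidationSketch`, `normalize`, `classify`, `f1`, and `evaluate` are hypothetical and chosen for illustration only; the F1 score stands in for whichever validation metric fits the problem.

```scala
import scala.util.Try

// Hypothetical illustration: names and design are assumptions,
// not the classes introduced in this book.
object ValidationSketch {

  // Step 1: rescale observations to [0, 1]; fails on empty or constant input.
  def normalize(xs: Vector[Double]): Try[Vector[Double]] = Try {
    require(xs.nonEmpty, "empty input")
    val (lo, hi) = (xs.min, xs.max)
    require(hi > lo, "constant input cannot be rescaled")
    xs.map(x => (x - lo) / (hi - lo))
  }

  // Step 2: a trivial threshold classifier standing in for a trained model.
  def classify(threshold: Double)(xs: Vector[Double]): Try[Vector[Int]] =
    Try(xs.map(x => if (x >= threshold) 1 else 0))

  // Step 3: F1 score computed from predicted and expected labels (1 = positive).
  def f1(predicted: Vector[Int], expected: Vector[Int]): Try[Double] = Try {
    require(predicted.size == expected.size, "size mismatch")
    val pairs = predicted.zip(expected)
    val tp = pairs.count { case (p, e) => p == 1 && e == 1 }
    val fp = pairs.count { case (p, e) => p == 1 && e == 0 }
    val fn = pairs.count { case (p, e) => p == 0 && e == 1 }
    require(tp > 0, "F1 is undefined without true positives")
    val precision = tp.toDouble / (tp + fp)
    val recall = tp.toDouble / (tp + fn)
    2.0 * precision * recall / (precision + recall)
  }

  // Monadic composition: each step runs only if the previous one succeeded.
  def evaluate(
      raw: Vector[Double],
      labels: Vector[Int],
      threshold: Double): Try[Double] =
    for {
      scaled    <- normalize(raw)
      predicted <- classify(threshold)(scaled)
      score     <- f1(predicted, labels)
    } yield score
}
```

Calling `ValidationSketch.evaluate(Vector(0.0, 2.0, 8.0, 10.0), Vector(0, 1, 1, 1), 0.5)` returns a `Success` holding the F1 score, while an empty input returns a `Failure` from the first step without running the remaining ones. Keeping the error handling inside the container, rather than in each caller, is the main reason for structuring the steps this way.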

The next chapter will address the problem of overfitting by identifying outliers and reducing noise in data.