Scala for Machine Learning (Second Edition)

Summary

This completes the overview of the most commonly used data filtering and smoothing techniques. Other preprocessing steps, such as normalization, analysis and reduction of variance, and identification of missing values, are just as essential for avoiding the garbage-in, garbage-out problem that plagues so many projects that use machine learning for regression or classification.

Scala can be used effectively to keep the code readable and to avoid cluttering method signatures with unnecessary arguments.

The three techniques presented in this chapter, from the simpler moving averages and Fourier transform to the more elaborate Kalman filter, go a long way toward preparing data for the topic of the next chapter: unsupervised learning and, more specifically, clustering.
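As a brief reminder of the simplest of these techniques, the following is a minimal sketch of a simple moving average. It is not the implementation developed in this chapter: the object name `SimpleMovingAverage`, the method `smooth`, and the choice to return an empty result when the series is shorter than the period are assumptions made for illustration only.

```scala
// Illustrative sketch only; names and edge-case behavior are assumptions,
// not the chapter's own implementation.
object SimpleMovingAverage {

  /** Averages each window of `period` consecutive values.
    * Returns an empty vector if the series is shorter than `period`.
    */
  def smooth(xs: Vector[Double], period: Int): Vector[Double] = {
    require(period > 0, "period must be positive")
    if (xs.size < period) Vector.empty
    else xs.sliding(period).map(_.sum / period).toVector
  }
}
```

For example, `SimpleMovingAverage.smooth(Vector(1.0, 2.0, 3.0, 4.0, 5.0), 3)` averages the windows (1, 2, 3), (2, 3, 4), and (3, 4, 5), so the smoothed series has `period - 1` fewer points than the input.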