Development of fast and reliable nature-inspired computing for supervised learning in high-dimensional data

de Campos Souza, Paulo Vitor
2020-01-01

Abstract

Machine learning and data mining tasks in big data involve inputs of diverse nature that typically exhibit high dimensionality, e.g. more than 1,000 features, far beyond the scale that can currently be processed on a single machine. In many domains, data have highly nonlinear representations that nature-inspired models can easily capture, outperforming simple models. However, using these approaches on high-dimensional data is computationally costly. Recently, artificial hydrocarbon networks (AHN), a supervised learning method inspired by organic chemical structures and mechanisms, have shown improvements in predictive power and interpretability compared with other well-known machine learning models, such as neural networks and random forests. However, AHN are so time-consuming that, until now, they have not been able to deal with big data. In this chapter, we present a fast and reliable nature-inspired training method for AHN so that they can handle high-dimensional data. This training method comprises a population-based meta-heuristic optimization, with both the individual encoding and the objective function defined in terms of the AHN model, and it is implemented using parallel computing. After benchmarking population-based optimization methods, grey wolf optimization (GWO) was selected. Our results demonstrate that the proposed hybrid GWO-based training method for AHN runs more than 1,400 times faster on high-dimensional data, without loss of predictive power, yielding a fast and reliable nature-inspired machine learning model. We also present a use case in assisted living monitoring, namely human fall classification with 1,269 features from sensor signals and video recordings, to show the implementation and performance of the proposed training algorithm. We anticipate that our new training algorithm will be useful in many applications in which big data is essential, such as medical engineering, robotics, finance, and aerospace.
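
For readers unfamiliar with grey wolf optimization, the following is a minimal sketch of the generic GWO update loop in plain Python. It is not the chapter's hybrid AHN training method: the function name `grey_wolf_optimize`, its parameters, the flat list-of-floats encoding, and the `sphere` objective are all assumptions made for illustration, and the parallel evaluation mentioned in the abstract is omitted. In an AHN setting the objective would presumably be a training error computed from the encoded model parameters.

```python
import random


def grey_wolf_optimize(objective, dim, bounds, n_wolves=20, n_iter=200, seed=0):
    """Minimize `objective` over a box using a basic grey wolf optimizer.

    Hypothetical helper for illustration only; each wolf is a flat list of
    `dim` floats, and `bounds` is a (low, high) pair applied to every dimension.
    """
    if n_wolves < 3:
        raise ValueError("GWO needs at least three wolves (alpha, beta, delta)")
    rng = random.Random(seed)
    low, high = bounds
    wolves = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for it in range(n_iter):
        # Rank the pack; the three best wolves lead the search.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]  # copies of alpha, beta, delta
        val = objective(leaders[0])
        if val < best_val:
            best_pos, best_val = list(leaders[0]), val

        a = 2.0 * (1.0 - it / n_iter)  # shrinks linearly from 2 toward 0
        for wolf in wolves:
            for d in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a   # step scale in [-a, a]
                    C = 2.0 * rng.random()           # leader weight in [0, 2]
                    D = abs(C * leader[d] - wolf[d])
                    estimate += leader[d] - A * D
                # Move to the average of the three leader-guided estimates,
                # clipped to the search box.
                wolf[d] = min(high, max(low, estimate / 3.0))

    # The last update has not been scored yet, so check it once more.
    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best_pos, best_val = list(final), objective(final)
    return best_pos, best_val


def sphere(x):
    """Toy objective (sum of squares), standing in for a model's training error."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    pos, val = grey_wolf_optimize(sphere, dim=5, bounds=(-10.0, 10.0))
    print("best value found:", val)
```

Because each wolf's fitness is evaluated independently within an iteration, the ranking step is the natural place to distribute work across processes, which is one plausible reading of the parallel implementation the abstract mentions.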
2020
978-3-030-33819-0
978-3-030-33820-6

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/341071