GPTIPS 2: an open-source software platform for symbolic data mining

Lookup NU author(s): Dr Dominic Searson


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


GPTIPS is a free, open-source, MATLAB-based software platform for symbolic data mining (SDM). It uses a multigene variant of the biologically inspired machine learning method of genetic programming (MGGP) as the engine that drives the automatic model discovery process. Symbolic data mining is the process of extracting hidden, meaningful relationships from data in the form of symbolic equations. In contrast to other data-mining methods, the structural transparency of the generated predictive equations can give new insights into the physical systems or processes that generated the data. Furthermore, this transparency makes the models very easy to deploy outside of MATLAB.

The rationale behind GPTIPS is to reduce the technical barriers to using, understanding, visualising and deploying GP-based symbolic models of data, whilst at the same time remaining highly customisable and delivering robust numerical performance for power users. In this chapter, notable new features of the latest version of the software - GPTIPS 2 - are discussed with these aims in mind. Additionally, a simplified variant of the MGGP high-level gene crossover mechanism is proposed.

It is demonstrated that the new functionality of GPTIPS 2 (a) facilitates the discovery of compact symbolic relationships from data using multiple approaches, e.g. using novel gene-centric visualisation analysis to mitigate horizontal bloat and reduce complexity in multigene symbolic regression models; (b) provides numerous methods for visualising the properties of symbolic models; (c) emphasises the generation of graphically navigable libraries of models that are optimal in terms of the Pareto trade-off surface of model performance and complexity; and (d) expedites real-world applications by the simple, rapid and robust deployment of symbolic models outside the software environment they were developed in.

Publication metadata

Author(s): Searson DP

Editor(s): Gandomi, AH; Alavi, AH; Ryan, C

Publication type: Book Chapter

Publication status: In Press

Book Title: Springer Handbook of Genetic Programming Applications

Year: 2015

Acceptance date: 04/02/2015

Publisher: Springer

