Investigating the effect of training–testing data stratification on the performance of soft computing techniques: an experimental study

Research output: Contribution to journal › Article › peer-review

43 Scopus citations

Abstract

Cross-validation of soft computing techniques needs to be done efficiently to avoid overfitting and underfitting. This is particularly important in petroleum reservoir characterisation applications, where the often-limited training and testing data subsets represent wells with known and unknown target properties, respectively. Existing data stratification strategies have been haphazardly chosen without any experimental basis. In this study, the optimal training–testing stratification proportions have been rigorously investigated using the prediction of porosity and permeability of petroleum reservoirs as an experimental case. The comparative performances of seven traditional and advanced machine learning techniques were considered. The overall results suggested a recommendable optimum training stratification that could serve as a good reference for researchers in similar applications.
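
The kind of experiment the abstract describes, sweeping the training–testing proportion and comparing held-out error, can be sketched as follows. This is a minimal illustration only, not the authors' procedure: the synthetic data, the single least-squares line used as the model, the candidate fractions, and the helper names `split`, `fit_line`, `rmse` and `sweep` are all assumptions made for the example.

```python
import math
import random


def split(data, train_fraction, rng):
    """Shuffle a copy of the data and cut it into (train, test) subsets."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_line(train):
    """Ordinary least squares fit of y = a*x + b to (x, y) pairs."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def rmse(model, test):
    """Root-mean-square error of the fitted line on the held-out pairs."""
    a, b = model
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in test) / len(test))


def sweep(data, fractions, repeats=20, seed=0):
    """Mean test RMSE for each training fraction, averaged over random splits."""
    rng = random.Random(seed)
    results = {}
    for fraction in fractions:
        errors = []
        for _ in range(repeats):
            train, test = split(data, fraction, rng)
            errors.append(rmse(fit_line(train), test))
        results[fraction] = sum(errors) / len(errors)
    return results


# Synthetic stand-in data (assumption): a noisy linear relationship,
# not real porosity or permeability measurements.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(100)]
data = [(x, 2.0 * x + 1.0 + data_rng.gauss(0.0, 0.5)) for x in xs]

fractions = [0.5, 0.6, 0.7, 0.8, 0.9]
results = sweep(data, fractions)
for fraction in fractions:
    print(f"train fraction {fraction:.1f}: mean test RMSE {results[fraction]:.3f}")
```

Averaging over repeated random splits is a design choice in this sketch: with a small data set, a single split at each proportion would mostly reflect which samples happened to land in the test subset.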

Original language: English
Pages (from-to): 517-535
Number of pages: 19
Journal: Journal of Experimental and Theoretical Artificial Intelligence
Volume: 29
Issue number: 3
State: Published - 4 May 2017

Bibliographical note

Publisher Copyright:
© 2016 Informa UK Limited, trading as Taylor & Francis Group.

Keywords

  • Stratification proportion
  • data-set division
  • permeability
  • porosity
  • soft computing

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Artificial Intelligence
