Machine Learning and Big Data (TC309)


Members 164
Technical Committee

TC Chair: Zhongqiang Liu

Short name: Machine Learning (TC309)

  • Introduction

Most geomaterials exhibit complex and uncertain behaviour because of the complex natural processes involved in their formation and deposition. As a result, many problems in geotechnical engineering are difficult to solve or predict accurately with conventional analytical and numerical methods. This is particularly the case when contributing factors arising from in situ geology, or from the characteristics of a specific phenomenon, do not necessarily have a physical relationship to each other. In these circumstances, machine learning (ML) and artificial intelligence (AI) based methods can have a significant impact in providing meaningful solutions. Early work on the application of ML and AI methods in geotechnics dates back to the early 1990s. In recent years, however, with pervasive developments in computer hardware and software, the application of ML and AI in soil mechanics and geotechnical engineering has gained particular pace, and growing interest in these methods has opened new horizons. To this end, TC309 of the ISSMGE has been formed to coordinate, organize and direct ISSMGE members’ efforts in this field.


  • Task Forces

The task forces focus on technical activities related to the investigation and application of machine learning and big data in geoscience. Task forces 1) and 2) aim to make benchmark datasets in geoscience accessible and to advance the acquisition of geotechnical big data. Task forces 3), 4) and 5) are associated with the different stages of a geotechnical engineering project: site investigation and soil property evaluation, design, and construction. Task force 6) aims to standardize ML/AI practice and build a robust approach/framework to support industry-wide confidence in the application of AI and ML in geosciences.

      1) Development of benchmark datasets (Task leader: Wengang Zhang)

Members: Wengang Zhang, Zijun Cao (to be updated; not limited to TC309 members, other contributors are also welcome)

Introduction: A large number of papers on machine learning (ML) algorithms and soft computing (SC) techniques have been published, and each claims that its proposed method works better than other approaches. However, these conclusions are reached on different datasets, so the claims are not directly comparable. A benchmark dataset, especially a big-data one, for model calibration and verification is therefore essential. The main aims of this TF are to: (1) compile or collect benchmark datasets for ML or SC model calibration, (2) facilitate future TC304-TC309 student contests, and (3) test new methods developed for geotechnical data analytics.

Requirements on datasets: A dataset should satisfy one or more of the following requirements: (1) monitoring data covering a few years, (2) extensive site investigations, (3) multivariate big data, (4) data stored and presented in a standard form, such as TC304dB.
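The argument in the introduction above, that competing methods can only be compared fairly when they are scored on the same dataset and the same split, can be sketched as follows. This is a minimal illustration under stated assumptions, not a TC309 procedure: the synthetic dataset, the two toy models and every function name are hypothetical and stand in for a real benchmark and real ML models.

```python
import math
import random


def make_synthetic_dataset(n=200, seed=0):
    """Hypothetical stand-in for a shared benchmark: y = 2x + 1 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


def split(xs, ys, test_fraction=0.25):
    """Fixed split: the first part is held out for testing, the rest is for training."""
    n_test = int(len(xs) * test_fraction)
    return (xs[n_test:], ys[n_test:]), (xs[:n_test], ys[:n_test])


def fit_mean(xs, ys):
    """Baseline model: always predicts the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least-squares straight line fitted to the training data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def rmse(model, xs, ys):
    """Root-mean-square error of a fitted model on the given points."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def compare_on_benchmark(fitters, seed=0):
    """Train and score every method on the SAME dataset and the SAME split."""
    xs, ys = make_synthetic_dataset(seed=seed)
    (x_train, y_train), (x_test, y_test) = split(xs, ys)
    return {name: rmse(fit(x_train, y_train), x_test, y_test)
            for name, fit in fitters.items()}


scores = compare_on_benchmark({"mean": fit_mean, "line": fit_line})
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: test RMSE = {score:.3f}")
```

Because the data and the split are fixed by the seed, a new method added to the `fitters` dictionary is scored under the same conditions as the existing ones, which is the property a shared benchmark is meant to provide.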

      2) Big data acquisition (Task leader: Zili Li)

Members: Zili Li, Dimitrios Zekkos, Linqing Luo, Mehdi Alhaddad (to be updated, not yet TC309 member, other contributors are welcome)

Introduction: Machine learning, one of the most active subfields of artificial intelligence, has been widely employed in many fields to perform specific tasks automatically, relying on patterns and inference rather than explicit human instructions. However, the application of machine learning in geoengineering still lags significantly behind e-commerce, social networks and many other fields. One major challenge in geoengineering is the lack of big data to train, test and validate machine learning models, as the acquisition of big geodata usually relies on manual site investigation, expensive laboratory testing, time-consuming field monitoring, etc.

Aims and Goals: Recent advances in Information and Communication Technologies (ICTs) have produced a series of innovative field monitoring technologies, including distributed fibre optic sensing, wireless sensor networks, autonomous robotic inspection and smartphone sensing, which make it possible to acquire large amounts of geodata at lower cost in labour and time than before. Nevertheless, the Technology Readiness Level (TRL) of many emerging geotechnical monitoring tools remains relatively low, well short of systematic, large-scale application. The Big data acquisition task force aims to develop and improve novel monitoring tools for big geodata acquisition, together with the associated geodata processing methods. The TF will provide a geoengineering platform for international researchers and professionals from different disciplines, sectors and countries to work collaboratively to break down barriers on the path to big geodata acquisition.

This TF welcomes contributions to big geodata acquisition in the following areas and beyond:

  • Development of innovative geotechnical monitoring technologies

  • Application of innovative field monitoring tools / methods to subsurface engineering

  • Gathering and processing of big data and metadata in geoengineering

  • Big geodata mining, generation of big geodata by computational methods, etc.

      3) Site investigation and geo-materials behavior (Task leader: Mohammad Rezania)

Members: Mohammad Rezania, Guotao Ma (to be updated, not yet TC309 member, other contributors are welcome)

Introduction: Geotechnical engineering is an inherently challenging discipline because soils are natural deposits whose characteristics often vary even within small areas and over short distances. In practice, where relevant or sufficient data are not available, very conservative design assumptions are used. In recent years, advanced field investigation and accurate modelling of natural geomaterials have gained pace because of increasing industry demand for models that support less conservative designs. However, their use requires sufficient field monitoring or laboratory test data for model validation and calibration. An enormous amount of data exists in the literature, but it is disjointed, fragmented and/or incoherent. To this end, the primary objectives of this TF are to: (1) collect, compile and classify the existing geotechnical monitoring and testing datasets from the literature and/or industrial records, and (2) provide a comprehensive and unified online database of geotechnical experimental resources for relevant ML model development/calibration, to facilitate future practical and research applications.

      4) Design (Task leader: Bruno Stuyts)

Members: Bruno Stuyts

Introduction: Increasing amounts of digital geotechnical data are enabling researchers and practitioners to make use of data-driven methods for the design of foundations, slopes and underground structures. Developing machine learning pipelines for geotechnical design imposes a number of requirements on the data (quality, structure, geospatial coverage, ...), on the tools used for data processing and on the foundation design algorithms.

Aims and scopes: In this task force, the necessary steps for enabling digital/automated geotechnical design will be studied and best practices will be suggested to allow end users to build data-driven workflows which rely on growing datasets. Special attention will be devoted to interpretable ML and feature engineering, to allow these workflows to capture engineering knowledge and underlying physical principles.


      5) Construction, Maintenance (Task leader: Dongming Zhang)

Members: Lisa Jinhui Li, Xu Li, Bin Liu, Kok-Kwang Phoon, Mingliang Zhou, Hongwei Huang, Huiming Wu and Jingya Yan

Introduction: Data from the construction and maintenance of geotechnical works are characterized by continuity, heterogeneity and time dependence. Analysing these data can benefit the quality and safety of construction and maintenance. As sensors become smarter than ever before, the large amounts of data they produce are forcing engineers to rethink how to make full use of this new “oil”, since a great deal of information could be extracted from these data rationally and efficiently. With machine learning developing at rocket speed in other disciplines, how will it reshape the construction and maintenance of geotechnical works? Answering this question is the main job of this task force.

Aims and scopes:

1. Construction: Geotechnical construction is quite versatile in terms of both methodology and tools. Mechanized methods such as TBM tunneling, shield machine tunneling, or even automated NATM tunneling often produce large amounts of data. The aim and scope of this TF is to determine how to restructure these data in a uniform pattern, analyze them in depth with machine learning algorithms, set construction parameters automatically in an unmanned manner, and finally optimize the cost and enhance the safety of geotechnical construction.

2. Maintenance: The monitoring and inspection of critical underground infrastructure, such as metro systems and pipeline systems, is one of the important issues in the maintenance of geo-structures and also produces large amounts of time-dependent data on structural performance. The aim and scope of this TF is to determine how to automatically detect disruption of monitored geo-infrastructure, rationally determine the timing of repair and maintenance work, and enhance the resilience of geo-infrastructure so that it remains in a good service state.

Activities to kick off:

-- Brainstorming of use of ML for construction and maintenance.

-- Setting up the criteria and formatting of the data collected from construction and maintenance

-- Choosing the most appropriate ML algorithms for specific data structures

-- Decision making for effective construction and maintenance

      6) Interact with industry, standards and guidelines (Task leader: Byron Quan Luna)

Introduction: Uses of artificial intelligence (AI), machine learning (ML) and other data-driven techniques have become increasingly widespread in recent years. Many now seek to capitalize on the potential such techniques offer to do things better, do things faster, and/or do things that were previously impossible. A data-driven model is a computational unit / program / function which makes predictions, and whose configuration is determined by a training operation on data. A data-driven application contains one or more data-driven models and uses the predictions of the model for some specific purpose.
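The distinction between a data-driven model and a data-driven application can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `ThresholdModel` and `InspectionAdvisor`, the crack-width readings and the labels are hypothetical, and the model is deliberately trivial. The only points it shows are that the model's configuration (a threshold) comes from training on data, and that the application uses the prediction for a specific purpose, here decision support for a human engineer.

```python
class ThresholdModel:
    """Data-driven model: its configuration (one threshold) is set by training."""

    def __init__(self):
        self.threshold = None  # unknown until fit() is called

    def fit(self, values, labels):
        # Training operation: place the threshold midway between the mean of
        # the positive examples and the mean of the negative examples.
        # Assumes both classes are present and positives tend to be larger.
        positives = [v for v, is_pos in zip(values, labels) if is_pos]
        negatives = [v for v, is_pos in zip(values, labels) if not is_pos]
        mean_pos = sum(positives) / len(positives)
        mean_neg = sum(negatives) / len(negatives)
        self.threshold = (mean_pos + mean_neg) / 2.0
        return self

    def predict(self, value):
        if self.threshold is None:
            raise RuntimeError("model has not been trained")
        return value > self.threshold


class InspectionAdvisor:
    """Data-driven application: wraps a model and uses its prediction for
    decision support; the final decision stays with a human."""

    def __init__(self, model):
        self.model = model

    def advise(self, value):
        if self.model.predict(value):
            return "flag for engineer review"
        return "no action suggested"


# Hypothetical training data: crack widths in mm and whether repair was needed.
widths = [0.1, 0.2, 0.3, 1.0, 1.2, 1.4]
needed_repair = [False, False, False, True, True, True]

advisor = InspectionAdvisor(ThresholdModel().fit(widths, needed_repair))
print(advisor.advise(0.9))
print(advisor.advise(0.4))
```

Retraining on different data changes the threshold without changing any code, which is what makes the model data-driven, while the wording returned by `advise` keeps the application in a decision-support role.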

Data-driven techniques are being used in a variety of applications including:

-- early detection of failures (before they happen) and maintenance

-- semi- and fully automated technical verification

-- prediction of unwanted events (geo-risks or other hazard incidents)

-- automatic classification of maintenance logs and inspection findings

-- detection of features (e.g. cracks)

The amount of responsibility humans are willing to hand over to any data-driven application depends on the criticality of the task it will perform, and the level of trust they have in the application. Most data-driven applications in use today have low criticality: their use is restricted either to low consequence scenarios, or to scenarios in which the application provides decision support for a human end user. But there is growing interest from vendors, consumers, industry and regulatory bodies in widening the scope in which data-driven applications can be used, to perform tasks with higher criticality and/or to move from decision support to decision taking.

However, difficulties remain in establishing trust that a data-driven application will operate as required, safely and reliably. The complexity of the data and the training algorithms, coupled with the lack of any standard approach to establishing trust in such applications, leads many to take a conservative approach and simply refuse to adopt such technologies until the field has matured, even though it is possible to enable trust in data-driven applications through a systematic and data science-oriented consideration of risk.

To date, no widely recognized standard exists for the assessment / assurance of data-driven geotechnical applications. In this context, the TF in TC309 will aim to fill that gap by combining domain experience and inspection capabilities with digital analytics expertise, and by collaborating with end users, to build a robust approach/framework to support industry-wide confidence in AI and ML.

  • Contact me

Chair:            Dr. Zhongqiang Liu, Norway, [email protected]

Vice-Chair:    Dr. Mohammad Rezania, UK, [email protected]

Secretary:      Dr. Dongming Zhang, China, [email protected]



Discussion Topic
Title | Posted | Description | Replies | Latest Reply
TC304/TC309 student contests | 23/04/2021 11:57 | Aug 7-9 2020, Chongqing University, China Participation form (download) Contest Question (… | 0 | N/A
TC304/TC309 student contests | 23/04/2021 11:54 | Oct 4-7 2020, Tokyo, Japan Organizer: Andy YF Leung, Zijun Cao, Lei Wang, Takayuki Shuku Award C… | 0 | N/A
TC304/TC309 student contests | 23/04/2021 11:51 | Sep 22 2019, Hannover, Germany Organizer: Giovanna Vessia, Wojciech Pula Award Committee: Jianye… | 0 | N/A
TC304/TC309 student contests | 23/04/2021 11:43 | Aug 18 2018, Harbin Institute of Technology, China Organizer: Dagang Lv, Hongwei Huang, Jie Zhang … | 0 | N/A
ISFOG2020 Data science prediction event | 01/06/2019 13:51 | In the run-up to the ISFOG 2020 conference (, I'm pleased to announce a community-driven predictio… | 0 | N/A
Greetings to all members of the ISSGME TC309 | 18/02/2019 18:35 | Hi everybody, I am happy to be member this year 2019 of this TC309 committee. Hope we can all make advances in the ap… | 0 | N/A
