TY - JOUR
T1 - Enhanced conditional Co-Gibbs sampling algorithm for data imputation
AU - Madani, Nasser
AU - Bazarbekov, Talgatbek
N1 - Funding Information:
The authors are grateful to Nazarbayev University for funding this work via the “Faculty Development Competitive Research Grants” for 2018–2020 under Contract No. 090118FD5336. We are also grateful to the Editorial team and three anonymous reviewers for their valuable comments, which substantially helped improve the final version of the manuscript.
Publisher Copyright:
© 2020 Elsevier Ltd
PY - 2021/3
Y1 - 2021/3
AB - The Gibbs sampler is an iterative algorithm for imputing the values of a random vector at locations where the variable of interest is missing. In this algorithm, the simulated values, obtained by solving a simple kriging system over several iterations, converge to the distribution of a Gaussian random vector with zero mean and a given covariance matrix. In a bivariate dataset, if the principal variable to be imputed depends on an auxiliary variable that is more abundant at the sample locations, this algorithm fails to reproduce the local and spatial cross-correlation structures. To overcome this impediment, this study proposes a variant of the Gibbs sampler, the conditional Co-Gibbs sampler, in which simple kriging is replaced by one of three alternative cokriging paradigms: multicollocated cokriging, collocated cokriging, and homotopic cokriging. The algorithm was applied to an actual case study to evaluate its performance statistically. The results indicate that the conditional Co-Gibbs sampler with multicollocated cokriging outperformed the alternatives, including simple kriging, in which data imputation ignores the influence of the auxiliary variable partially or totally. In addition, a computer program, provided as an open-source executable file, was used to implement the proposed algorithm for data imputation in bivariate cases.
KW - Algorithms
KW - Data processing
KW - Geology
KW - Geostatistics
KW - Spatial statistics
UR - http://www.scopus.com/inward/record.url?scp=85100673862&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100673862&partnerID=8YFLogxK
U2 - 10.1016/j.cageo.2020.104655
DO - 10.1016/j.cageo.2020.104655
M3 - Article
AN - SCOPUS:85100673862
SN - 0098-3004
VL - 148
JO - Computers and Geosciences
JF - Computers and Geosciences
M1 - 104655
ER -