Multi-label prediction method for lithology, lithofacies and fluid classes based on data augmentation by cascade forest

Authors

  • Ruiyi Han College of Geo-Exploration Science and Technology, Jilin University, Changchun 130021, P. R. China
  • Zhuwen Wang College of Geo-Exploration Science and Technology, Jilin University, Changchun 130021, P. R. China
  • Yuhang Guo* College of Geo-Exploration Science and Technology, Jilin University, Changchun 130021, P. R. China (Email: guoyuhang100@jlu.edu.cn)
  • Xinru Wang College of Geo-Exploration Science and Technology, Jilin University, Changchun 130021, P. R. China
  • Ruhan A College of Geo-Exploration Science and Technology, Jilin University, Changchun 130021, P. R. China
  • Gaoming Zhong Northeast Oil and Gas Branch of Sinopec, Changchun 130000, P. R. China

Abstract

Predicting the lithology, lithofacies and reservoir fluid classes of igneous rocks is of significant value for CO2 storage and reservoir evaluation. However, there is no precedent for research on the multi-label identification of igneous rocks. This study proposes a multi-label data-augmented cascade forest method that predicts lithology, lithofacies and fluid labels from 9 conventional logging data features of cores collected from the eastern depression of the Liaohe Basin in northeastern China. The unbalanced multi-label training set is augmented using the multi-label synthetic minority over-sampling technique. Training is carried out by a multi-label cascade forest consisting of predictive clustering trees, whose cascade structure has adaptive feature selection and layer growth mechanisms. To cover all possible outcomes and to test the generalization ability of the method, a simulated well model is built, on which the method is compared with 6 typical multi-label learning methods. The method outperforms these baselines on the evaluation metrics, which supports its advantage in accuracy and generalization ability. The consistency between the predicted results and the geological data of actual wells verifies the reliability of the method. The results show that it can serve as a reliable means of multi-label prediction of igneous lithology, lithofacies and reservoir fluids.
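
The augmentation step named in the abstract can be illustrated with a minimal sketch. It follows the general idea of the multi-label synthetic minority over-sampling technique (labels whose imbalance ratio exceeds the mean are treated as minority labels, and new samples are interpolated between a minority sample and one of its nearest neighbours). It is not the authors' implementation: the function names, the neighbour count k, the one-synthetic-sample-per-seed policy and the majority-vote label rule are assumptions made for illustration.

```python
import numpy as np

def imbalance_ratios(Y):
    """Per-label imbalance ratio: largest label count divided by each label's count."""
    counts = Y.sum(axis=0).astype(float)
    present = counts > 0                      # labels that never occur are skipped
    irlbl = np.zeros(Y.shape[1])
    irlbl[present] = counts.max() / counts[present]
    return irlbl, irlbl[present].mean()

def mlsmote_sketch(X, Y, k=5, seed=0):
    """Illustrative MLSMOTE-style oversampling (assumed settings, not the paper's code).

    X: (n_samples, n_features) float array, e.g. 9 logging features per depth point.
    Y: (n_samples, n_labels) binary indicator matrix.
    """
    rng = np.random.default_rng(seed)
    irlbl, mean_ir = imbalance_ratios(Y)
    new_X, new_Y = [], []
    # Minority labels are those whose imbalance ratio exceeds the mean ratio.
    for label in np.flatnonzero(irlbl > mean_ir):
        bag = np.flatnonzero(Y[:, label] == 1)
        if len(bag) < 2:
            continue                          # no neighbour to interpolate with
        for i in bag:
            dist = np.linalg.norm(X[bag] - X[i], axis=1)
            ordered = bag[np.argsort(dist)]
            neighbours = ordered[ordered != i][:k]
            ref = rng.choice(neighbours)
            gap = rng.random()
            # Synthetic features lie on the segment between the seed and the neighbour.
            new_X.append(X[i] + gap * (X[ref] - X[i]))
            # A label is kept if more than half of seed + neighbours carry it.
            votes = Y[i] + Y[neighbours].sum(axis=0)
            new_Y.append((votes > (len(neighbours) + 1) / 2).astype(int))
    if not new_X:
        return X, Y
    return np.vstack([X, np.array(new_X)]), np.vstack([Y, np.array(new_Y)])
```

Calling `mlsmote_sketch(X, Y)` on a feature matrix and a binary label matrix returns the original samples followed by the synthetic ones, so the result can be passed directly to any multi-label learner in place of the unbalanced training set.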

Document Type: Original article

Cited as: Han, R., Wang, Z., Guo, Y., Wang, X., A, R., Zhong, G. Multi-label prediction method for lithology, lithofacies and fluid classes based on data augmentation by cascade forest. Advances in Geo-Energy Research, 2023, 9(1): 25-37. https://doi.org/10.46690/ager.2023.07.04

Keywords:

Data augmentation, deep learning, igneous rock, multi-label learning

Published

2023-07-09

How to Cite

Ruiyi Han, Zhuwen Wang, Yuhang Guo, Xinru Wang, Ruhan A, & Gaoming Zhong. (2023). Multi-label prediction method for lithology, lithofacies and fluid classes based on data augmentation by cascade forest. Advances in Geo-Energy Research, 9(1), 25–37. Retrieved from https://ager.yandypress.com/index.php/2207-9963/article/view/286
