
Gold Science and Technology, 2024, Vol. 32, Issue (3): 539-547. doi: 10.11872/j.issn.1005-2518.2024.03.040

• Mining Technology and Mine Management •

Semi-supervised Ore Granularity Prediction Algorithm Incorporating Fully Supervised Learning

Zhihong JIANG1,2, Ao CHEN1

  1. Faculty of Mechatronic Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China
  2. Jiangxi Mining and Metallurgy Electromechanical Engineering Technology Research Center, Ganzhou 341000, Jiangxi, China
  • Received: 2024-01-31  Revised: 2024-04-11  Online: 2024-06-30  Published: 2024-07-05
  • Contact: Ao CHEN  E-mail: jzhee_mail@163.com; 1012558903@qq.com
  • About the author: Zhihong JIANG (1977-), male, born in Jiangdu, Jiangsu Province; associate professor, engaged in research on intelligent mining equipment technology and its applications. jzhee_mail@163.com
  • Funding:
    National Natural Science Foundation of China project "Interface damage evolution and liberation mechanism of encapsulated minerals under multi-point symmetric ultrasonic loading" (52364025)

Abstract:

Improving the accuracy of ore particle size analysis in mineral processing depends on the number of labeled samples, and traditional fully supervised modeling methods generalize poorly. To address both problems, a semi-supervised ore particle size prediction algorithm incorporating fully supervised learning was proposed. Ore particle size data obtained from images of the ore on the conveyor belt were taken as the research object. Four ore particle size features, namely particle size, weighted arithmetic mean size, standard deviation and deviation coefficient, were adopted as input features, and three prediction models were established: decision tree, GBDT and BP neural network. A training set was constructed by stratified sampling of the original labeled ore particle size samples. Semi-supervised learning was then used to obtain pseudo-labels for the unlabeled image-recognized ore particle size samples. High-confidence pseudo-labeled samples were screened out and added to the original labeled samples, expanding the limited labeled set, and the corresponding samples were deleted from the unlabeled set. Finally, to improve the performance of the prediction model, a new regression prediction model was constructed on the expanded labeled set. An ore particle size dataset obtained by sieving was used to validate the algorithm. The results show that the coefficient of determination of the semi-supervised prediction algorithm incorporating fully supervised learning reaches 92.1%, which is 5.0, 5.4 and 5.2 percentage points higher than that of the traditional fully supervised decision tree, ridge regression and Bayesian regression models, respectively. The root-mean-square error is 0.023, a reduction of 23.33%, 23.33% and 20.69%, respectively, and the mean absolute error is 0.02, a reduction of 23.08%, 13.04% and 9.09%, respectively. The prediction accuracy is therefore significantly improved, which verifies the feasibility and reliability of the semi-supervised ore particle size prediction model incorporating fully supervised learning. The work further confirms the advantages of semi-supervised learning, provides technical support for improving the accuracy of ore particle size detection, offers new ideas and methods for the practical application of ore particle size prediction, and is expected to improve production efficiency and quality control in ore processing and utilization.

Key words: semi-supervised learning, granularity detection, pseudolabeling, particle size distribution, machine learning, ore
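The pseudo-labeling loop described in the abstract (predict labels for unlabeled samples, keep only the high-confidence ones, add them to the labeled set, delete them from the unlabeled pool) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not state how confidence is judged, so the sketch assumes a k-nearest-neighbour regressor as a stand-in model and treats a small spread among the neighbours' labels as "high confidence". The names `knn_predict`, `self_train` and `max_spread` are hypothetical, and the sample values are taken from Table 1.

```python
import statistics

def knn_predict(labeled, x, k=3):
    """Return (mean, spread) of the targets of the k labeled samples nearest to x."""
    nearest = sorted(
        labeled,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)),
    )[:k]
    targets = [y for _, y in nearest]
    return statistics.mean(targets), statistics.pstdev(targets)

def self_train(labeled, unlabeled, max_spread=0.05, k=3, max_rounds=10):
    """Grow the labeled set with pseudo-labels whose neighbour spread is small.

    ASSUMPTION: 'confidence' is modelled as neighbour agreement (spread <= max_spread);
    the source does not specify its confidence criterion.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        accepted, remaining = [], []
        for x in pool:
            y_hat, spread = knn_predict(labeled, x, k)
            if spread <= max_spread:
                accepted.append((x, y_hat))   # high-confidence pseudo-label
            else:
                remaining.append(x)           # stays in the unlabeled pool
        if not accepted:                      # nothing new was added, so stop
            break
        labeled.extend(accepted)
        pool = remaining
    return labeled, pool

# Features: (image-recognized distribution, weighted mean size, std, deviation coefficient)
# Target: manually sieved distribution (values from Table 1).
labeled = [
    ((0.322, 3.245, 1.586, 0.244), 0.290),
    ((0.315, 2.434, 1.190, 0.183), 0.285),
    ((0.363, 0.811, 0.397, 0.061), 0.425),
    ((0.501, 4.019, 1.267, 0.158), 0.474),
    ((0.356, 3.014, 0.950, 0.118), 0.312),
    ((0.143, 1.005, 0.317, 0.039), 0.214),
]
unlabeled = [
    (0.359, 3.421, 1.548, 0.226),
    (0.330, 2.566, 1.161, 0.170),
    (0.311, 0.855, 0.387, 0.057),
]

expanded, remaining = self_train(labeled, unlabeled)
print(len(expanded), "labeled samples,", len(remaining), "still unlabeled")
```

In practice the final regression model (BP neural network, GBDT and so on) would then be retrained on `expanded`; the threshold and the stand-in regressor would need to be chosen for the real data.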

CLC Number: TF4

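Table 1

Data structure of the ore particle size dataset (partial data)

| Sample type | Ore size fraction /mm | Image-recognized ore particle size distribution /% | Weighted arithmetic mean size /mm | Standard deviation | Deviation coefficient | Manually sieved ore particle size distribution /% |
|---|---|---|---|---|---|---|
| Labeled sample data | +10 | 0.322 | 3.245 | 1.586 | 0.244 | 0.290 |
| | +5~10 | 0.315 | 2.434 | 1.190 | 0.183 | 0.285 |
| | -5 | 0.363 | 0.811 | 0.397 | 0.061 | 0.425 |
| | +10 | 0.501 | 4.019 | 1.267 | 0.158 | 0.474 |
| | +5~10 | 0.356 | 3.014 | 0.950 | 0.118 | 0.312 |
| | -5 | 0.143 | 1.005 | 0.317 | 0.039 | 0.214 |
| Unlabeled sample data | +10 | 0.359 | 3.421 | 1.548 | 0.226 | |
| | +5~10 | 0.330 | 2.566 | 1.161 | 0.170 | |
| | -5 | 0.311 | 0.855 | 0.387 | 0.057 | |
| | +10 | 0.325 | 3.286 | 1.582 | 0.241 | |
| | +5~10 | 0.318 | 2.465 | 1.187 | 0.181 | |
| | -5 | 0.357 | 0.822 | 0.396 | 0.060 | |
| | +10 | 0.366 | 3.468 | 1.504 | 0.221 | |
| | +5~10 | 0.338 | 2.601 | 1.148 | 0.166 | |
| | -5 | 0.296 | 0.867 | 0.383 | 0.055 | |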
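The abstract states that the training set was built by stratified sampling of the labeled samples. A minimal sketch of such a split is given below, assuming the strata are the three size fractions (+10, +5~10, -5); the source does not say which variable was used for stratification, and the function name `stratified_split`, the 50% training fraction and the seed are hypothetical. The rows are the labeled samples of Table 1.

```python
import random
from collections import defaultdict

def stratified_split(samples, key, train_fraction=0.5, seed=0):
    """Split samples so that every stratum (given by key) is represented in the training set."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in samples:
        strata[key(sample)].append(sample)
    train, test = [], []
    for group in strata.values():
        group = group[:]                       # copy, so the input is not shuffled in place
        rng.shuffle(group)
        n_train = max(1, round(len(group) * train_fraction))
        train.extend(group[:n_train])
        test.extend(group[n_train:])
    return train, test

# (size fraction, image distribution, weighted mean, std, deviation coefficient, sieved distribution)
labeled_rows = [
    ("+10",   0.322, 3.245, 1.586, 0.244, 0.290),
    ("+5~10", 0.315, 2.434, 1.190, 0.183, 0.285),
    ("-5",    0.363, 0.811, 0.397, 0.061, 0.425),
    ("+10",   0.501, 4.019, 1.267, 0.158, 0.474),
    ("+5~10", 0.356, 3.014, 0.950, 0.118, 0.312),
    ("-5",    0.143, 1.005, 0.317, 0.039, 0.214),
]

train_rows, test_rows = stratified_split(labeled_rows, key=lambda row: row[0])
print(sorted(row[0] for row in train_rows))
```

With two samples per size fraction and a 50% training fraction, each split receives one sample from every fraction, so neither set is dominated by a single size class.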

Table 2

Best hyperparameters and prediction errors of each model

| Model | Best hyperparameters | RMSE | MAE |
|---|---|---|---|
| Tree | max_depth = 4 | 0.030 | 0.026 |
| RF | n_estimators = 20 | 0.040 | 0.027 |
| GBDT | learning_rate = 0.01; max_depth = 5; n_estimators = 200 | 0.026 | 0.022 |
| XGBoost | gamma = 0.0; max_depth = 4; min_child_weight = 4; n_estimators = 70 | 0.040 | 0.028 |
| Bayes | alpha_1 = 1e-08; alpha_2 = 1e-06; lambda_1 = 1e-06; lambda_2 = 1e-08; n_iter = 100 | 0.029 | 0.022 |
| Polynomial regression | Linearregression_fit_intercept = True; polynomialfeatures_degree = 2 | 0.080 | 0.051 |
| Ridge regression | Alpha = 0.3; gamma = 0.1; kernel = linear | 0.030 | 0.023 |
| SVM | C = 1.0; Gamma = 1.0; Kernel = linear | 0.075 | 0.063 |
| BP neural network | Activation = relu; Alpha = 0.0001; hidden_layer_sizes = (100,) | 0.030 | 0.023 |

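Table 3

Coefficient of determination of the 5 prediction models on the actual ore dataset

| Model | R2 |
|---|---|
| Tree | 0.871 |
| GBDT | 0.899 |
| Bayes | 0.869 |
| Ridge regression | 0.867 |
| BP neural network | 0.872 |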
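The improvement figures quoted in the abstract can be checked against the baseline values in Table 2 (RMSE, MAE) and Table 3 (R2). The short sketch below assumes the semi-supervised model's values are those stated in the abstract (R2 = 0.921, RMSE = 0.023, MAE = 0.020); the helper names are hypothetical. It treats the error figures as relative reductions and the R2 figures as differences in percentage points, which is the reading that matches the tabulated numbers.

```python
# Semi-supervised model, as reported in the abstract.
semi = {"RMSE": 0.023, "MAE": 0.020, "R2": 0.921}

# Fully supervised baselines: RMSE and MAE from Table 2, R2 from Table 3.
baselines = {
    "Tree":  {"RMSE": 0.030, "MAE": 0.026, "R2": 0.871},
    "Ridge": {"RMSE": 0.030, "MAE": 0.023, "R2": 0.867},
    "Bayes": {"RMSE": 0.029, "MAE": 0.022, "R2": 0.869},
}

def relative_reduction_pct(base, new):
    """Relative reduction of an error metric, in percent."""
    return round((base - new) / base * 100, 2)

def gain_in_points(base, new):
    """Absolute gain of R2, in percentage points."""
    return round((new - base) * 100, 1)

summary = {
    name: {
        "RMSE_reduction_pct": relative_reduction_pct(vals["RMSE"], semi["RMSE"]),
        "MAE_reduction_pct": relative_reduction_pct(vals["MAE"], semi["MAE"]),
        "R2_gain_points": gain_in_points(vals["R2"], semi["R2"]),
    }
    for name, vals in baselines.items()
}

for name, row in summary.items():
    print(name, row)
```

The computed values reproduce the 23.33%, 23.33% and 20.69% RMSE reductions, the 23.08%, 13.04% and 9.09% MAE reductions, and the 5, 5.4 and 5.2 point gains in R2 reported in the abstract.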

Fig. 1

Flow chart of the semi-supervised algorithm

Fig. 2

Neuron structure model
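Since the figure itself is not reproduced on this page, the following is a generic sketch of the standard artificial neuron that such a diagram usually depicts: a weighted sum of the inputs plus a bias, passed through an activation function. ReLU is used because Table 2 lists `Activation = relu` for the BP neural network; the weights, bias and function names here are hypothetical illustration values, not taken from the paper.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)

def neuron(inputs, weights, bias, activation=relu):
    """Output of a single neuron: activation(sum_i w_i * x_i + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical example: two inputs, arbitrary weights and bias.
print(neuron([1.0, 2.0], [0.5, 0.5], 0.0))
print(neuron([1.0, 2.0], [0.5, -1.0], 0.25))
```

A BP (back-propagation) network stacks layers of such neurons and adjusts the weights and biases by propagating the prediction error backwards through the layers.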

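Table 4

High-confidence pseudo-labeled sample data (partial data)

| Ore size fraction /mm | Image-recognized ore particle size distribution /% | Weighted arithmetic mean size /mm | Standard deviation | Deviation coefficient | Model-predicted ore particle size distribution /% |
|---|---|---|---|---|---|
| +5~10 | 0.33 | 2.566 | 1.161 | 0.170 | 0.318 |
| -5 | 0.357 | 0.822 | 0.396 | 0.060 | 0.377 |
| +10 | 0.366 | 3.468 | 1.504 | 0.221 | 0.285 |
| -5 | 0.296 | 0.867 | 0.383 | 0.055 | 0.393 |
| +10 | 0.311 | 3.167 | 1.606 | 0.254 | 0.297 |
| -5 | 0.389 | 0.792 | 0.401 | 0.063 | 0.279 |
| +10 | 0.305 | 3.156 | 1.600 | 0.254 | 0.302 |
| +5~10 | 0.305 | 2.367 | 1.200 | 0.190 | 0.416 |
| +5~10 | 0.321 | 2.515 | 1.176 | 0.175 | 0.350 |
| -5 | 0.332 | 0.838 | 0.392 | 0.059 | 0.318 |
| +10 | 0.366 | 3.468 | 1.530 | 0.221 | 0.386 |
| +5~10 | 0.340 | 2.601 | 1.148 | 0.166 | 0.280 |
| -5 | 0.360 | 0.803 | 0.388 | 0.060 | 0.324 |
| +10 | 0.310 | 3.363 | 1.507 | 0.224 | 0.402 |
| +5~10 | 0.380 | 2.522 | 1.130 | 0.168 | 0.282 |
| -5 | 0.310 | 0.841 | 0.377 | 0.056 | 0.305 |
| +10 | 0.310 | 3.313 | 1.535 | 0.232 | 0.415 |
| -5 | 0.250 | 0.903 | 0.367 | 0.058 | 0.325 |
| +5~10 | 0.380 | 2.606 | 1.114 | 0.160 | 0.396 |
| +5~10 | 0.390 | 2.569 | 1.114 | 0.163 | 0.325 |
| +10 | 0.300 | 3.150 | 1.596 | 0.253 | 0.388 |
| -5 | 0.390 | 0.788 | 0.399 | 0.063 | 0.282 |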

Fig. 3

Comparison of the test results of the 8 prediction models

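Table 5

Comparison of the evaluation metrics of the 8 prediction models

| Model | RMSE | MAE | R2 |
|---|---|---|---|
| BP neural network | 0.031 | 0.027 | 0.862 |
| Semi-supervised learning model | 0.023 | 0.020 | 0.921 |
| GBDT | 0.030 | 0.024 | 0.871 |
| RF | 0.041 | 0.033 | 0.756 |
| SVM | 0.046 | 0.034 | 0.692 |
| Decision tree | 0.029 | 0.025 | 0.882 |
| XGBOOST | 0.030 | 0.025 | 0.872 |
| Bayes | 0.029 | 0.022 | 0.879 |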
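The three evaluation metrics used in this comparison (RMSE, MAE and R2) are not defined on this page. The sketch below uses their standard textbook definitions, which are assumed (not confirmed by the source) to be the ones the authors applied; the function names and the small example vectors are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt(mean((y - y_hat)^2))."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error: mean(|y - y_hat|)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy vectors, only to exercise the functions.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Lower RMSE and MAE and a higher R2 indicate a better fit, which is how the semi-supervised model is ranked first in the table.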

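Table 6

Comparison of the test performance of different combined prediction models

| Model | RMSE | MAE | R2 |
|---|---|---|---|
| Semi-supervised + decision tree | 0.031 | 0.027 | 0.862 |
| Semi-supervised + BP | 0.023 | 0.020 | 0.921 |
| Semi-supervised + GBDT | 0.030 | 0.024 | 0.871 |
| Semi-supervised + RF | 0.041 | 0.033 | 0.756 |
| Semi-supervised + XGBoost | 0.046 | 0.034 | 0.692 |
| Semi-supervised + Bayes | 0.029 | 0.025 | 0.882 |
| Semi-supervised + ridge regression | 0.030 | 0.025 | 0.872 |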