The development of machine learning sheds new light on the traditionally complicated problem of thermodynamics in multicomponent alloys. Successful application of such methods, however, depends strongly on the quality of the data and the model. Here we propose a scheme to improve the representativeness of the data by using short-range order (SRO) parameters to survey the configuration space. Using the improved data, a pair interaction model is trained for the NbMoTaW high entropy alloy using linear regression. Benefiting from the physics incorporated into the model, the learned effective Hamiltonian demonstrates excellent predictive power over the whole configuration space. By including pair interactions within the 6th nearest-neighbor shell, this model achieves a testing R² score of 0.997 and a root mean square error of 0.43 meV. We further perform a detailed analysis of the effects of training data, testing data, and model parameters. The results reveal the vital importance of representative data and physical models. We also examine the performance of neural networks, which are found to show a strong tendency to overfit the data.
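The pair-interaction regression described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' code or data: a periodic 1D chain stands in for the BCC supercell, only three neighbor shells are used instead of six, configurations are generated by random swaps from an ordered arrangement (a crude stand-in for SRO-guided sampling), and the energies are synthetic values produced from hidden, randomly drawn interactions rather than DFT results. The helper names `pair_features` and `sample_config` are hypothetical.

```python
import numpy as np

N_SPECIES = 4   # stand-ins for Nb, Mo, Ta, W, encoded as 0..3
N_SHELLS = 3    # toy value; the source model includes up to the 6th shell
N_SITES = 64    # sites on a periodic 1D chain (toy stand-in for a BCC supercell)
N_PAIRS = N_SPECIES * (N_SPECIES + 1) // 2  # unordered species pairs per shell


def pair_features(config):
    """Fraction of each unordered species pair (a, b) at each neighbor distance."""
    iu = np.triu_indices(N_SPECIES)
    feats = []
    for shell in range(1, N_SHELLS + 1):
        neighbor = np.roll(config, -shell)          # periodic neighbor at this distance
        lo = np.minimum(config, neighbor)           # sort each pair so (a, b) == (b, a)
        hi = np.maximum(config, neighbor)
        counts = np.zeros((N_SPECIES, N_SPECIES))
        np.add.at(counts, (lo, hi), 1.0)            # accumulate pair counts
        feats.append(counts[iu] / config.size)
    return np.concatenate(feats)


def sample_config(rng):
    """Equimolar chain, ranging from ordered (few swaps) to near-random (many swaps)."""
    config = np.tile(np.arange(N_SPECIES), N_SITES // N_SPECIES)
    for _ in range(rng.integers(0, 4 * N_SITES)):
        i, j = rng.integers(0, N_SITES, size=2)
        config[i], config[j] = config[j], config[i]
    return config


rng = np.random.default_rng(0)

# Hidden pair interactions in meV: synthetic, used only to generate toy energies.
true_v = rng.normal(0.0, 10.0, size=N_SHELLS * N_PAIRS)

X = np.array([pair_features(sample_config(rng)) for _ in range(400)])
y = X @ true_v + rng.normal(0.0, 0.01, size=len(X))  # synthetic energies, not DFT

X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

# Linear regression for the effective pair interactions.
# The pair fractions in each shell sum to one, so the design matrix is
# rank deficient; lstsq returns the minimum-norm solution in that case.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

pred = X_test @ coef
rmse = np.sqrt(np.mean((pred - y_test) ** 2))
r2 = 1.0 - np.sum((pred - y_test) ** 2) / np.sum((y_test - y_test.mean()) ** 2)
print(f"test R^2 = {r2:.4f}, RMSE = {rmse:.4f} meV")
```

Because the features are linearly dependent, the fitted coefficients are not unique and should not be read as the hidden interactions themselves; only the predicted energies are well determined in this sketch.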
- Jiaxin Zhang*, Xianglin Liu, Sirui Bi, Junqi Yin, Guannan Zhang, and Markus Eisenbach. "Robust data-driven approach for predicting the configurational energy of high entropy alloys." Materials & Design 185 (2020): 108247.
- Xianglin Liu, Jiaxin Zhang*, Markus Eisenbach, and Yang Wang. "Machine learning modeling of high entropy alloy: the role of short-range order." arXiv preprint arXiv:1906.02889 (2019). Submitted.
- Xianglin Liu, Jiaxin Zhang*, Sirui Bi, Yang Wang, G. Malcolm Stocks, and Markus Eisenbach. "Chemical complexity in high entropy alloys: A pair-interaction perspective." arXiv preprint arXiv:1907.10223 (2019). Submitted.
Significance and Impact
A data-driven framework is proposed for predicting the configurational energy of high entropy alloys.
Accuracy and robustness of the model are improved via physical feature selection with Bayesian information criterion.
Uncertainty of the Bayesian regression model is quantified, with robust performance demonstrated.
The study uses a large configurational system and more than 1000 DFT data points calculated with LSMS.
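The feature selection and uncertainty quantification mentioned in the points above can be sketched as follows. This is a hedged illustration under stated assumptions, not the published workflow: the design matrix is random synthetic data whose columns are merely grouped into hypothetical "shells", only the first three groups carry signal, the BIC is the standard Gaussian linear-model form n ln(RSS/n) + k ln(n), and the Bayesian step is textbook conjugate linear regression with an assumed prior precision `alpha`.

```python
import numpy as np

rng = np.random.default_rng(1)

N_SAMPLES, N_SHELLS, FEATS_PER_SHELL = 500, 6, 10
TRUE_SHELLS = 3   # assumption: only the first 3 shells carry signal in this toy data
NOISE = 0.5       # synthetic noise level in meV

# Synthetic design matrix with columns grouped by neighbor shell (not real SRO data).
X = rng.normal(size=(N_SAMPLES, N_SHELLS * FEATS_PER_SHELL))
w_true = np.zeros(X.shape[1])
n_active = TRUE_SHELLS * FEATS_PER_SHELL
w_true[:n_active] = rng.normal(0.0, 5.0, size=n_active)
y = X @ w_true + rng.normal(0.0, NOISE, size=N_SAMPLES)

X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]


def bic(X_sub, y_obs):
    """BIC of a Gaussian linear model, n*ln(RSS/n) + k*ln(n), constants dropped."""
    n, k = X_sub.shape
    coef, *_ = np.linalg.lstsq(X_sub, y_obs, rcond=None)
    rss = np.sum((y_obs - X_sub @ coef) ** 2)
    return n * np.log(rss / n) + k * np.log(n)


# Score nested models that include shells 1..m and keep the lowest BIC.
scores = {m: bic(X_train[:, : m * FEATS_PER_SHELL], y_train)
          for m in range(1, N_SHELLS + 1)}
best_shells = min(scores, key=scores.get)

# Bayesian linear regression on the selected features.
k = best_shells * FEATS_PER_SHELL
A, A_test = X_train[:, :k], X_test[:, :k]
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
beta = (len(y_train) - k) / np.sum((y_train - A @ coef) ** 2)  # noise precision estimate
alpha = 1e-2                                                   # prior precision (assumed)

S = np.linalg.inv(alpha * np.eye(k) + beta * A.T @ A)  # posterior covariance of weights
mean_w = beta * S @ A.T @ y_train                      # posterior mean of weights

pred_mean = A_test @ mean_w
# Predictive variance = noise variance + x^T S x for each test row x.
pred_std = np.sqrt(1.0 / beta + np.einsum("ij,jk,ik->i", A_test, S, A_test))
coverage = np.mean(np.abs(y_test - pred_mean) < 2.0 * pred_std)

print("shells selected by BIC:", best_shells)
print(f"fraction of test points within 2 predictive std: {coverage:.2f}")
```

The BIC penalty k ln(n) grows with every added shell, so shells that only fit noise are rejected even though they always lower the training residual.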
Last Updated: November 11, 2020 - 5:03 pm