Highlight

Robust machine learning modeling of high entropy alloy

Square lattice with effective pair interactions highlighted. (a) The nearest-neighbor pair is marked in blue and the next-nearest-neighbor pair in yellow; (b) the pairs marked in green, pink, and red correspond to the 3rd, 4th, and 5th neighbors, respectively. Equivalent interacting pairs (same distance) are marked in the same color.

Achievement

The development of machine learning sheds new light on the traditionally complicated problem of thermodynamics in multicomponent alloys. Successful application of such methods, however, depends strongly on the quality of the data and the model. Here we propose a scheme to improve the representativeness of the data by using short-range order (SRO) parameters to survey the configuration space. Using the improved data, a pair interaction model is trained for the NbMoTaW high entropy alloy by linear regression. Benefiting from the physics incorporated into the model, the learned effective Hamiltonian demonstrates excellent predictive power over the whole configuration space. By including pair interactions within the 6th nearest-neighbor shell, the model achieves a testing R² score of 0.997 and a root mean square error of 0.43 meV. We further perform a detailed analysis of the effects of training data, testing data, and model parameters. The results reveal the vital importance of representative data and physical models. We also examined the performance of neural networks, which are found to show a strong tendency to overfit the data.
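As a rough illustration of what "a pair interaction model trained by linear regression" can look like, the sketch below counts species pairs per neighbor shell and fits effective pair interactions by least squares. Everything in it is an assumption made for illustration: the lattice is the 2D square lattice of the figure rather than the alloy's real lattice, only three shells are used, configurations are purely random, and the energies are synthetic values generated from made-up interactions, not DFT data. The names `SHELLS`, `pair_features`, and `true_w` are hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

# One displacement per bond direction for the first three neighbor shells
# of a 2D square lattice (a toy stand-in for the alloy's real lattice).
SHELLS = [
    [(1, 0), (0, 1)],    # 1st shell
    [(1, 1), (1, -1)],   # 2nd shell
    [(2, 0), (0, 2)],    # 3rd shell
]
N_SPECIES = 4  # labels 0..3 standing in for Nb, Mo, Ta, W
PAIRS = list(combinations_with_replacement(range(N_SPECIES), 2))

# Lookup table mapping an ordered species pair to its unordered-pair index.
PAIR_INDEX = np.zeros((N_SPECIES, N_SPECIES), dtype=int)
for k, (a, b) in enumerate(PAIRS):
    PAIR_INDEX[a, b] = PAIR_INDEX[b, a] = k


def pair_features(config):
    """Per-site counts of each unordered species pair in each shell."""
    feats = []
    for shell in SHELLS:
        counts = np.zeros(len(PAIRS))
        for dx, dy in shell:
            # Periodic boundaries: neighbor[i, j] == config[i + dx, j + dy].
            neighbor = np.roll(config, shift=(-dx, -dy), axis=(0, 1))
            idx = PAIR_INDEX[config, neighbor].ravel()
            counts += np.bincount(idx, minlength=len(PAIRS))
        feats.append(counts / config.size)
    return np.concatenate(feats)


rng = np.random.default_rng(0)
configs = [rng.integers(0, N_SPECIES, size=(8, 8)) for _ in range(300)]
X = np.array([pair_features(c) for c in configs])

# Synthetic "true" pair interactions (meV) that decay with shell index,
# plus small noise. These are NOT DFT energies.
true_w = np.concatenate([rng.normal(0, 50.0 / (s + 1), len(PAIRS))
                         for s in range(len(SHELLS))])
y = X @ true_w + rng.normal(0, 0.1, len(X))

# Fit on 200 configurations, evaluate on the remaining 100.
X_train, X_test, y_train, y_test = X[:200], X[200:], y[:200], y[200:]
w_fit, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

residual = y_test - X_test @ w_fit
rmse_test = np.sqrt(np.mean(residual ** 2))
r2_test = 1.0 - np.sum(residual ** 2) / np.sum((y_test - y_test.mean()) ** 2)
print(f"test R^2 = {r2_test:.4f}, test RMSE = {rmse_test:.3f} meV")
```

Because the pair counts in each shell sum to a fixed number of bonds per site, the feature columns are linearly dependent. The individual fitted weights are therefore not unique, although the predicted energies are; `np.linalg.lstsq` returns the minimum-norm solution in that case.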


Significance and Impact

  • A data-driven framework is proposed for predicting the configurational energy of high entropy alloys.

  • Accuracy and robustness of the model are improved via physical feature selection with the Bayesian information criterion.

  • Uncertainty of the Bayesian regression model is quantified, with robust performance demonstrated.

  • The model uses a large configurational system and more than 1000 DFT data points calculated with LSMS.
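The feature selection with the Bayesian information criterion mentioned in the list above can be sketched as a choice of neighbor-shell cutoff. The example is a minimal illustration under stated assumptions, not the procedure used in the work: it uses the common Gaussian-residual form BIC = n ln(RSS/n) + k ln(n), random Gaussian columns stand in for the pair-count features, the data are synthetic, and `bic`, `feats_per_shell`, and `true_shells` are hypothetical names.

```python
import numpy as np


def bic(X, y):
    """BIC of a least-squares fit, assuming Gaussian residuals."""
    n, k = X.shape
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ w) ** 2)
    return n * np.log(rss / n) + k * np.log(n)


rng = np.random.default_rng(1)
n_samples, n_shells, feats_per_shell = 300, 6, 10

# Synthetic features grouped by shell: columns 0..9 belong to shell 1, etc.
X = rng.normal(size=(n_samples, n_shells * feats_per_shell))

# Only the first three shells carry signal in this toy data set.
true_shells = 3
w = np.zeros(X.shape[1])
w[: true_shells * feats_per_shell] = rng.normal(size=true_shells * feats_per_shell)
y = X @ w + rng.normal(0, 0.1, n_samples)

# Score nested models that keep shells 1..m, then pick the lowest BIC.
scores = {m: bic(X[:, : m * feats_per_shell], y) for m in range(1, n_shells + 1)}
best_cutoff = min(scores, key=scores.get)
print("shell cutoff with lowest BIC:", best_cutoff)
```

The k ln(n) term penalizes each added shell, so shells that only fit noise raise the score even though they lower the residual slightly.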

Last Updated: November 11, 2020 - 5:03 pm