Nonlinear comprehensive evaluation method based on information entropy and discrimination optimization

Guanjun Xu; Xijun Zeng

doi:10.1515/nleng-2025-0154

Article Open Access

Published/Copyright: June 20, 2025

From the journal Nonlinear Engineering Volume 14 Issue 1

Abstract

In the comprehensive evaluation of multiple indicators, the existence of nonlinear relationships often makes the discrimination of traditional evaluation methods unstable or difficult to accurately reflect the complex relationship between indicators. The outcome of a comprehensive evaluation is to assign a quantitative value to each evaluation object for selection or ranking. The discriminability in the evaluation results is an important measure of the effectiveness of the comprehensive evaluation. In response to the issues of uncertain discriminability or poor stability in commonly used comprehensive evaluation methods, a comprehensive evaluation method prioritizing discriminability is proposed. Based on the principle that information entropy can reflect the degree of variation in the evaluation dataset, a model for quantitatively analyzing the discriminability of evaluation indicators is provided, and the conclusion that low-discriminability indicators will reduce the overall discriminability of the comprehensive evaluation is proven. Accordingly, weighted indicators, qualification indicators, and invalid indicators are defined. By identifying and eliminating low-discriminability qualification indicators and invalid indicators, and retaining weighted indicators, the evaluation results ensure good discriminability while maintaining the comprehensiveness of the evaluation. Through empirical analysis, the scientificity and effectiveness of this method in processing multi-dimensional and nonlinear data are verified.

Keywords: discriminability; comprehensive evaluation method; indicator classification

1 Introduction

Comprehensive evaluation refers to the overall and holistic assessment of an object system described by a multi-attribute system structure, i.e., for the entire set of evaluation objects, a certain method is used to assign a quantitative value to each evaluation object based on the given conditions, and then to select the best or rank them [1]. This helps decision-makers accurately grasp the essential patterns of the evaluation objects, providing strong support for scientific decision-making. Comprehensive evaluation methods are widely applied in various fields such as education, economy, environment, society [2], and engineering evaluation [3,4]. Comprehensive evaluation methods are based on the construction of an evaluation indicator system and apply mathematical models to quantify evaluation results. The most commonly used comprehensive evaluation method is the simple additive weighting model [5]. The basic principle of this method is to allocate corresponding weights according to the importance and influence of each evaluation indicator on the evaluation target, thereby obtaining objective and reasonable quantitative evaluation results [6]. This requires that the comprehensive evaluation method, while systematically reflecting the multi-dimensional characteristics of the evaluation objects around the evaluation target, also ensures that the evaluation results have good discriminability, i.e., it can effectively distinguish the comprehensive levels of the evaluation objects through quantified results with certain differences, especially the level differences between different grades of evaluation objects [7]. Comprehensiveness and discriminability [8] are to some extent mutually restrictive. Generally, due to the individual differences of each indicator of the evaluation objects, the discriminability of comprehensive evaluation decreases as the number of evaluation indicators increases. How to balance the two is an important aspect to consider in comprehensive evaluation methods.

Common comprehensive evaluation methods include subjective weighting methods, objective weighting methods, and combined evaluation methods [9]. The weight determination of subjective weighting methods is based on the subjective judgment of experts [10] and does not depend on the evaluation indicator dataset, providing good stability. However, the evaluation process does not incorporate information from the evaluation dataset, and the discriminability of the evaluation results is uncertain. Objective weighting methods determine the weights based on the degree of variation in the evaluation indicator dataset. Generally, the larger the indicator weight, the greater the discriminability of its dataset. Therefore, the evaluation results have good discriminability. However, in this method, the indicator weights, which are the core of the evaluation method, change with the evaluation dataset, resulting in poor stability. Moreover, in many cases, evaluation indicators with a large degree of variation in the dataset are not necessarily the most important indicators that reflect the evaluation target. Combined evaluation methods [11,12] are a research hotspot in comprehensive evaluation. Combined evaluation methods integrate subjective and objective weighting methods. Therefore, their weights to some extent reflect the discriminability of the evaluation indicator dataset. However, since these methods are more problem-oriented and the models are more complex, they lack universality and simplicity [13]. The discriminability of the evaluation results is highly related to the positioning of the objective weighting method in the model and is also uncertain.

Based on the aforementioned comparison, although objective weighting methods have good discriminability, their instability makes them difficult to apply in routine comprehensive evaluations. Subjective weighting methods and combined evaluation methods have good stability, but the discriminability of these two methods is uncertain and needs further improvement. The research content of this article is to construct an evaluation method with good discriminability and comprehensiveness, based on the given comprehensive evaluation indicator system, indicator weights, and evaluation dataset. Inspired by the entropy weight method, the principle that information entropy can reflect the degree of variation in the evaluation dataset is applied to define the discriminability of evaluation indicators. Conclusions on how weighted low-discriminability evaluation indicators reduce the discriminability of comprehensive evaluation results are provided, leading to a comprehensive evaluation method prioritizing discriminability. By involving important indicators in the evaluation, the comprehensiveness of the evaluation is reflected, and by eliminating low-discriminability indicators, the discriminability of the evaluation results is ensured. The method presented in this study, which has been validated through both a material selection case analysis and synthetic evaluation datasets, exhibits good evaluation discriminability and effectiveness.

2 Quantitative analysis of evaluation indicator discriminability

The fundamental approach of the proposed method is to sequentially weight important evaluation indicators based on their discriminability priority quantification values, given an evaluation indicator system, indicator weights, and an evaluation dataset. By identifying and eliminating low-discriminability evaluation indicators, the method ensures that the evaluation results have good discriminability. The following analysis focuses on the quantification of discriminability and the impact of evaluation indicators on discriminability, and provides a method for indicator classification.

2.1 Quantification of discriminability

The qualitative description of discriminability is frequently mentioned in comprehensive evaluations, and there are various methods for its quantification. The traditional method is the discriminability index, which is applied in educational measurement to assess whether test items can effectively distinguish the ability levels of different students [14]. Let the evaluation dataset be $x$ and the maximum score be $T$. Select a certain percentage of the highest scores from $x$ as the high-score group and calculate its average value $x_H$. Similarly, select the same percentage of the lowest scores as the low-score group and calculate its average value $x_L$. Commonly used percentages are 27% and 33%. The discriminability index is defined as follows:

(1) $D(x) = \dfrac{x_H - x_L}{T}.$
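The discriminability index of Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function name `discriminability_index`, the rounding rule for the group size, and the sample scores are assumptions and do not come from the article.

```python
def discriminability_index(scores, max_score, fraction=0.27):
    """Eq. (1): (mean of high-score group - mean of low-score group) / max score."""
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")
    ordered = sorted(scores)
    # Group size: rounded share of the objects, at least one (assumed rule).
    k = max(1, round(len(ordered) * fraction))
    low_mean = sum(ordered[:k]) / k
    high_mean = sum(ordered[-k:]) / k
    return (high_mean - low_mean) / max_score


# Hypothetical scores of ten evaluation objects on a 100-point scale.
scores = [52, 61, 67, 70, 74, 78, 83, 88, 91, 96]
d27 = discriminability_index(scores, max_score=100, fraction=0.27)
print(round(d27, 4))
```

With ten objects, both the 27% and the 33% settings select three objects per group under this rounding rule, so the two indices coincide for such a small sample.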

Guo et al. [15] proposed using the range or variance to measure the discriminability of comprehensive evaluations. Subsequently, other scholars have proposed quantification methods such as standard deviation and deviation [16,17]. To comprehensively reflect the discriminability between any two evaluation objects, Li and Gao [8,18] improved the aforementioned methods and proposed a discriminability function defined by all evaluation data. The method first sorts the comprehensive evaluation result dataset from low to high, obtaining $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$, and the quantification is calculated as follows:

(2) $D(x^*) = \dfrac{1}{n}\left\{\ln\left[1 + (x_n^* - x_1^*)\right] + \sum_{i=1}^{n-1}\ln\left[1 + (x_{i+1}^* - x_i^*)\right]\right\}.$

The aforementioned methods for quantifying discriminability require sorting of the evaluation results, which makes them difficult to apply in the general theoretical derivation of discriminability. In the concept of the entropy weight method, the smaller the information entropy of a dataset, the greater its degree of variation, which implies a better discriminability. Let the normalized standard vector be $x = (x_1, x_2, \ldots, x_n)$, where $x_i \in [0, 1]$ and $\sum_{i=1}^{n} x_i = 1$. The information entropy is defined as $H(x) = -\sum_{i=1}^{n} x_i \ln x_i$. The following definition is provided based on information entropy.

Definition: For the normalized standard vector $x = (x_1, x_2, \ldots, x_n)$ with information entropy $H(x)$, the discriminability is defined as follows:

(3) $D(x) = 1 - \dfrac{H(x)}{\ln n}.$

When $x$ is the normalized standard vector of an evaluation indicator dataset, it is hereinafter referred to as the indicator vector, and $D(x)$ is called the indicator discriminability. When $x$ is the normalized standard vector of the comprehensive evaluation result dataset, it is hereinafter referred to as the comprehensive evaluation vector, and $D(x)$ is called the comprehensive evaluation discriminability.

From Eq. (3), it is known that $D(x) \in [0, 1]$. When $x = (1/n, 1/n, \ldots, 1/n)$, $D(x) = 0$. When $x_i = 1$ for some $i \in \{1, 2, \ldots, n\}$, for example $x = (1, 0, \ldots, 0)$, the convention $0 \ln 0 = \lim_{x \to 0^+} x \ln x = 0$ is used in the calculation of information entropy, and in this case $D(x) = 1$. Consider two two-dimensional evaluation datasets $x = (p, 1 - p)$ and $y = (q, 1 - q)$. If $p > q \ge 0.5$, it is easy to prove that $D(x) > D(y)$. This indicates that the more evenly distributed the evaluation dataset, the lower the comprehensive evaluation discriminability; the greater the distribution differences in the dataset, the higher the comprehensive evaluation discriminability. Moreover, Eq. (3) is highly sensitive to the distribution of the high-scoring partition; thus, this definition can effectively reflect the ability to distinguish excellent groups in selective comprehensive evaluations.
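The boundary cases described above can be checked with a short Python sketch of Eq. (3). The function name `discriminability` and the choice to normalize raw values by their sum are assumptions made for illustration; the convention $0 \ln 0 = 0$ is implemented by skipping zero entries.

```python
import math


def discriminability(values):
    """Eq. (3): D(x) = 1 - H(x) / ln(n) for a non-negative data vector."""
    n = len(values)
    total = sum(values)
    if n < 2 or total <= 0 or any(v < 0 for v in values):
        raise ValueError("need at least two non-negative values with a positive sum")
    # Normalize so the entries sum to 1 (assumed normalization step).
    probs = [v / total for v in values]
    # Zero entries are skipped, which implements the convention 0 * ln 0 = 0.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(n)


d_uniform = discriminability([1, 1, 1, 1])   # evenly distributed data
d_one_hot = discriminability([1, 0, 0, 0])   # all mass on one object
d_x = discriminability([0.8, 0.2])           # p = 0.8
d_y = discriminability([0.6, 0.4])           # q = 0.6
print(d_uniform, d_one_hot, d_x > d_y)
```

The uniform vector gives a value of zero up to floating-point error, the one-hot vector gives one, and the two-dimensional comparison agrees with the statement that $p > q \ge 0.5$ implies $D(x) > D(y)$.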

2.2 Impact of evaluation indicators on comprehensive evaluation discriminability

In this section, it is demonstrated, under the discriminability definition given by Eq. (3), that weighting an indicator whose dataset has low indicator discriminability reduces the comprehensive evaluation discriminability. Additionally, a lower bound for the rate of decrease in comprehensive evaluation discriminability is estimated. This estimation theoretically ensures the feasibility of the method presented in this article. The following analysis is conducted for an $m$-dimensional comprehensive evaluation problem with indicator vectors $p_1, p_2, \ldots, p_m$ and corresponding weights $\omega_1, \omega_2, \ldots, \omega_m$.

Lemma: The discriminability $D(\cdot)$ defined by Eq. (3) is convex, i.e., for $m$ indicator vectors in a comprehensive evaluation given by $p_i = (p_{1,i}, p_{2,i}, \ldots, p_{n,i})$ and any set of weights $\omega_i \in (0, 1)$ with $\sum_{i=1}^{m} \omega_i = 1$, it holds that $D\left(\sum_{i=1}^{m} \omega_i p_i\right) \le \sum_{i=1}^{m} \omega_i D(p_i)$.

Proof: Let $\omega_i \in (0, 1)$ be any set of weights with $\sum_{i=1}^{m} \omega_i = 1$, and let $|p|$ denote the sum of the components of a vector $p$. Then

$\left|\sum_{i=1}^{m} \omega_i p_i\right| = \sum_{i=1}^{m} \omega_i |p_i| = \sum_{i=1}^{m} \omega_i = 1,$

so the weighted indicator vector remains a normalized standard vector.

Let $f(x) = -x \ln x$ for $0 < x < 1$; then $H(p_i) = \sum_{j=1}^{n} f(p_{j,i})$.

Since $f''(x) = -\dfrac{1}{x} < 0$, $f(x)$ is a concave function, and by Jensen's inequality we have

$f\left(\sum_{i=1}^{m} \omega_i p_{j,i}\right) \ge \sum_{i=1}^{m} \omega_i f(p_{j,i}).$

Summing the above inequality over $j$ from 1 to $n$ gives

$\sum_{j=1}^{n} f\left(\sum_{i=1}^{m} \omega_i p_{j,i}\right) \ge \sum_{j=1}^{n} \sum_{i=1}^{m} \omega_i f(p_{j,i}) = \sum_{i=1}^{m} \omega_i \sum_{j=1}^{n} f(p_{j,i}).$

Therefore,

$D\left(\sum_{i=1}^{m} \omega_i p_i\right) = 1 - \dfrac{1}{\ln n} H\left(\sum_{i=1}^{m} \omega_i p_i\right) = 1 - \dfrac{1}{\ln n} \sum_{j=1}^{n} f\left(\sum_{i=1}^{m} \omega_i p_{j,i}\right) \le 1 - \dfrac{1}{\ln n} \sum_{i=1}^{m} \omega_i \sum_{j=1}^{n} f(p_{j,i}) = \sum_{i=1}^{m} \omega_i - \dfrac{1}{\ln n} \sum_{i=1}^{m} \omega_i \sum_{j=1}^{n} f(p_{j,i}) = \sum_{i=1}^{m} \omega_i \left[1 - \dfrac{1}{\ln n} \sum_{j=1}^{n} f(p_{j,i})\right] = \sum_{i=1}^{m} \omega_i D(p_i).$

That is, $D\left(\sum_{i=1}^{m} \omega_i p_i\right) \le \sum_{i=1}^{m} \omega_i D(p_i)$. □
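The inequality of the lemma can be spot-checked numerically. The sketch below draws random normalized vectors and weights and records the gap between the weighted average of the discriminabilities and the discriminability of the weighted vector. The helper names, the sample sizes, and the random seed are illustrative assumptions, and a numerical check of this kind supports the lemma but does not replace the proof.

```python
import math
import random


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def discriminability(p):
    """Eq. (3) for an already normalized vector p."""
    entropy = -sum(x * math.log(x) for x in p if x > 0)
    return 1.0 - entropy / math.log(len(p))


def lemma_gap(vectors, weights):
    """Return sum_i w_i D(p_i) - D(sum_i w_i p_i); the lemma says this is >= 0."""
    n = len(vectors[0])
    mixed = [sum(w * p[j] for w, p in zip(weights, vectors)) for j in range(n)]
    weighted_avg = sum(w * discriminability(p) for w, p in zip(weights, vectors))
    return weighted_avg - discriminability(mixed)


rng = random.Random(0)  # fixed seed so the experiment is repeatable
gaps = []
for _ in range(1000):
    # m = 4 indicator vectors over n = 8 evaluation objects (arbitrary sizes).
    vectors = [normalize([rng.random() + 1e-9 for _ in range(8)]) for _ in range(4)]
    weights = normalize([rng.random() + 1e-9 for _ in range(4)])
    gaps.append(lemma_gap(vectors, weights))

min_gap = min(gaps)
print(min_gap >= -1e-12)
```

A small negative tolerance is used in the comparison only to absorb floating-point rounding.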

Theorem 1: If an evaluation indicator with very low indicator discriminability is added, it will reduce the comprehensive evaluation discriminability. Suppose the original weighted comprehensive evaluation discriminability is $D(p_0)$, and a new evaluation indicator vector $p_1$ with very low indicator discriminability is added, i.e., $D(p_1) < D(p_0)$. Then, for any $\omega_0^*, \omega_1^* \in (0, 1)$ with $\omega_0^* + \omega_1^* = 1$, the new weighted comprehensive evaluation vector $p = \omega_0^* p_0 + \omega_1^* p_1$ satisfies $D(p) < D(p_0)$.

Proof: From the lemma, we have

$D(p) = D(\omega_0^* p_0 + \omega_1^* p_1) \le \omega_0^* D(p_0) + \omega_1^* D(p_1).$

Since $D(p_1) < D(p_0)$, it follows that

$\omega_0^* D(p_0) + \omega_1^* D(p_1) < \omega_0^* D(p_0) + \omega_1^* D(p_0) = (\omega_0^* + \omega_1^*) D(p_0) = D(p_0).$

Therefore, $D(p) < D(p_0)$. □

Theorem 2: Let the comprehensive evaluation discriminability be $D(p_0) > 0$, and let a new indicator vector $p_1$ be added. For any $\omega_0^*, \omega_1^* \in (0, 1)$ with $\omega_0^* + \omega_1^* = 1$, the new weighted comprehensive evaluation vector is $p = \omega_0^* p_0 + \omega_1^* p_1$. Let $\eta = D(p_1)/D(p_0)$, and let the rate of decrease of the comprehensive evaluation discriminability of $p$ relative to $p_0$ be $\Delta = (D(p_0) - D(p))/D(p_0)$. Then, it holds that $\Delta \ge \omega_1^* (1 - \eta)$.

Proof: From the lemma, we have

$D(p) \le \omega_0^* D(p_0) + \omega_1^* D(p_1).$

Since $\omega_0^* = 1 - \omega_1^*$, this implies

$D(p) \le (1 - \omega_1^*) D(p_0) + \omega_1^* D(p_1).$

Rearranging the terms, we get

$D(p_0) - D(p) \ge \omega_1^* \left[D(p_0) - D(p_1)\right].$

Dividing both sides by $D(p_0) > 0$, we obtain

$\dfrac{D(p_0) - D(p)}{D(p_0)} \ge \omega_1^* \dfrac{D(p_0) - D(p_1)}{D(p_0)} = \omega_1^* \left[1 - \dfrac{D(p_1)}{D(p_0)}\right],$

i.e., $\Delta \ge \omega_1^* (1 - \eta)$. □

According to Theorem 2, the lower bound of the discriminability decline rate is directly proportional to the parameter $(1 - \eta)$. When $\eta < 1$, the smaller the value of $\eta$, the closer the lower bound of the discriminability decline rate approaches $\omega_1^*$ when a new indicator is added. In practical evaluations, the discriminability $D(x)$ varies with different datasets of the evaluation objects. Therefore, it is reasonable to assess the impact of evaluation indicators on the discriminability of evaluation results using the relative value of the discriminability ratio $\eta$.
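Theorems 1 and 2 can be illustrated on a small example. The sketch below mixes a fairly discriminating vector with a nearly uniform one and compares the observed decline rate with the lower bound. The vectors, the weight 0.2, and the function names are hypothetical values chosen for illustration only.

```python
import math


def discriminability(p):
    """Eq. (3) for an already normalized vector p."""
    entropy = -sum(x * math.log(x) for x in p if x > 0)
    return 1.0 - entropy / math.log(len(p))


def decline_and_bound(p0, p1, w1):
    """Return (Delta, w1 * (1 - eta), eta) for p = (1 - w1) * p0 + w1 * p1."""
    mixed = [(1 - w1) * a + w1 * b for a, b in zip(p0, p1)]
    d0, d1, d = discriminability(p0), discriminability(p1), discriminability(mixed)
    eta = d1 / d0              # discriminability ratio of the new indicator
    delta = (d0 - d) / d0      # observed rate of decrease
    return delta, w1 * (1 - eta), eta


p0 = [0.5, 0.3, 0.1, 0.1]        # current comprehensive evaluation vector (hypothetical)
p1 = [0.26, 0.25, 0.25, 0.24]    # nearly uniform, low-discriminability indicator
delta, bound, eta = decline_and_bound(p0, p1, w1=0.2)
print(f"eta={eta:.4f}  decline={delta:.4f}  lower bound={bound:.4f}")
```

In this example the observed decline exceeds the bound, which is what Theorem 2 guarantees; the bound itself is close to the weight of the new indicator because its discriminability ratio is close to zero.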

If an evaluation indicator with indicator discriminability greater than the original comprehensive evaluation discriminability is added, then based on the ranking distribution of the new indicator vector, there are two scenarios: ① if the ranking distribution is consistent with that of the original comprehensive evaluation vector, it will enhance the comprehensive evaluation discriminability, and ② if the ranking distribution is opposite to that of the original comprehensive evaluation vector, it will significantly reduce the comprehensive evaluation discriminability but will effectively reflect the comprehensiveness of the evaluation. Therefore, evaluation indicators with higher indicator discriminability are considered ideal indicator types.

2.3 Indicator classification method

Based on the prior conclusions, it is necessary to select indicators with higher indicator discriminability for weighting in order to ensure the comprehensive evaluation discriminability. In the following, a qualitative description of the classification of evaluation indicators is provided:

Weighted indicators: indicators with higher indicator discriminability. The most ideal weighted indicators are those with relatively larger weights. Weighted indicators participate in the quantitative calculation of comprehensive evaluations.

Qualification indicators: indicators with lower indicator discriminability but relatively larger weights. Generally, these indicators are highly relevant to the evaluation objectives. They can be set at different levels, such as excellent and good, as mandatory qualification conditions that the evaluation objects must meet.

Invalid indicators: Indicators with both lower indicator discriminability and smaller weights. Such indicators have a minimal impact on the results of comprehensive evaluations and can be excluded from the comprehensive evaluation process, serving only as a reference for evaluation.

For both qualification and invalid indicators, the evaluating entity needs to check the scientific nature of the quantification of the evaluation dataset and consider whether to improve the quantification method to enhance their indicator discriminability.

The weight and discriminability of an indicator are both positively correlated with the importance of the evaluation indicator, and they exhibit a multiplicative effect. A synergy model can be applied to define the discriminability priority quantification value as $\omega_i D(p_i)$ [19]. Indicators with larger values of this quantity are prioritized for weighting in the evaluation process. Suppose $p_1, p_2, \ldots, p_m$ are the evaluation indicator vectors sorted by the discriminability priority quantification value, i.e., $\omega_1 D(p_1) \ge \omega_2 D(p_2) \ge \cdots \ge \omega_m D(p_m)$. Weight each weighted indicator in sequence. Let the comprehensive evaluation vector of the first $k$ weighted indicators be $p_0 = \sum_{i=1}^{k} \left(\omega_i / \sum_{j=1}^{k} \omega_j\right) p_i$. After adding the indicator vector $p_{k+1}$, the comprehensive evaluation vector becomes $p = \left(\sum_{i=1}^{k} \omega_i / \sum_{i=1}^{k+1} \omega_i\right) p_0 + \left(\omega_{k+1} / \sum_{i=1}^{k+1} \omega_i\right) p_{k+1}$, and let $\eta = D(p_{k+1})/D(p_0)$. According to Theorem 2, when $\eta \to 0$, the decrease rate of $D(p)$ can be estimated as $\Delta \ge \omega_1^* (1 - \eta) = \left(\omega_{k+1} / \sum_{i=1}^{k+1} \omega_i\right)(1 - \eta) \ge \omega_{k+1}(1 - \eta) \to \omega_{k+1}$.

Based on the aforementioned analysis, the quantitative method for classifying the three types of indicators is as follows: let the threshold of $\eta$ be $\alpha$. When $\eta \ge \alpha$, the newly added indicator has high indicator discriminability and is classified as a weighted indicator. When $\eta < \alpha$, the newly added indicator has low indicator discriminability, and a threshold $\beta$ for the weight of the indicator is introduced. In practical applications, $\beta$ can be taken as $1/(2m)$, which is half of the average weight of the indicators. If $\omega_{k+1} \ge \beta$, the weight of the indicator is large, and the lower bound of the rate of decrease in comprehensive evaluation discriminability after weighting this indicator is approximately $\omega_{k+1}$, which will significantly reduce the comprehensive evaluation discriminability; the indicator is classified as a qualification indicator. If $\omega_{k+1} < \beta$, the weight of the indicator is small, and the indicator is classified as an invalid indicator.
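The classification rule can be sketched as a single pass over the indicators in order of their priority value. This is one possible reading of the procedure and not the authors' implementation: the function name `classify_indicators`, the choice to seed $p_0$ with the top-priority indicator, the decision to leave $p_0$ unchanged when an indicator is rejected, and the toy data are all assumptions.

```python
import math


def discriminability(p):
    """Eq. (3) for an already normalized vector p."""
    entropy = -sum(x * math.log(x) for x in p if x > 0)
    return 1.0 - entropy / math.log(len(p))


def classify_indicators(vectors, weights, alpha=0.25, beta=None):
    """Label each indicator as 'weighted', 'qualification', or 'invalid'.

    vectors[i] is the normalized indicator vector of indicator i and
    weights[i] is its weight. Returns (labels, p0), where p0 is the
    comprehensive evaluation vector built from the weighted indicators.
    """
    m = len(weights)
    if beta is None:
        beta = 1.0 / (2 * m)  # half of the average weight
    # Sort by the discriminability priority value w_i * D(p_i), largest first.
    order = sorted(range(m),
                   key=lambda i: weights[i] * discriminability(vectors[i]),
                   reverse=True)
    first = order[0]
    labels = {first: "weighted"}  # assumption: top-priority indicator seeds p0
    kept_weight = weights[first]
    p0 = list(vectors[first])
    for i in order[1:]:
        d0 = discriminability(p0)
        eta = discriminability(vectors[i]) / d0 if d0 > 0 else float("inf")
        if eta >= alpha:
            labels[i] = "weighted"
            total = kept_weight + weights[i]
            # Re-normalized weighting of the current p0 and the new indicator.
            p0 = [(kept_weight * a + weights[i] * b) / total
                  for a, b in zip(p0, vectors[i])]
            kept_weight = total
        elif weights[i] >= beta:
            labels[i] = "qualification"
        else:
            labels[i] = "invalid"
    return labels, p0


# Toy data: four indicators over four evaluation objects (hypothetical values).
vectors = [
    [0.55, 0.25, 0.15, 0.05],
    [0.40, 0.30, 0.20, 0.10],
    [0.26, 0.25, 0.25, 0.24],
    [0.25, 0.26, 0.24, 0.25],
]
weights = [0.4, 0.3, 0.2, 0.1]
labels, p0 = classify_indicators(vectors, weights)
print(labels)
```

With $m = 4$ the default weight threshold is $\beta = 0.125$, so the nearly uniform indicator with weight 0.2 becomes a qualification indicator and the one with weight 0.1 becomes an invalid indicator.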

Next, numerical experiments are carried out to determine the suitable initial value of the threshold α for η. In the experiments, η is taken as 0.05, 0.1, 0.15, 0.2, 0.25, and 0.3. Multiple sets of evaluation datasets are randomly generated to examine the influence of the indicator discriminability and weight of the newly added evaluation indicators on the comprehensive evaluation discriminability. In the experiments, the weight ω of the newly added indicators ranges from 0.05 to 0.3, and the changes in the rate of decrease of comprehensive evaluation discriminability Δ are depicted in Figure 1.

Figure 1

Percentage decrease of discriminability with the newly added indicator: (a) proposed discriminability, (b) the lower bound estimated by Theorem 2, (c) discriminability index, and (d) discriminability function of reference [8].

Based on the experiments, the following conclusions can be drawn: ① the discriminability defined by Eq. (3) (as shown in Figure 1a) and those defined by Eqs. (1) and (2) (as shown in Figure 1c and d) all exhibit an approximately linear change trend with respect to η and the weight ω of the newly added indicator. This indicates a high degree of consistency among the different definitions of discriminability quantification. ② Figure 1a and b show that as η decreases and the weight ω of the newly added evaluation indicator increases, the rate of decrease in comprehensive evaluation discriminability increases approximately linearly, which is consistent with the estimation of Theorem 2. However, the actual rate of decrease is higher than the estimated lower bound, and the gap increases as η decreases. ③ From Figure 1a, it can also be observed that when η ≥ 0.25, the ratio of the rate of decrease in comprehensive evaluation discriminability to the weight of the newly added indicator is less than 1, i.e., the rate of decrease in comprehensive evaluation discriminability after adding a new indicator is relatively limited. Therefore, taking α = 0.25 as the initial value for the threshold of η is reasonably justified.

The initial threshold value of α = 0.25 for η is validated using the Cohen’s kappa consistency test. In the experiment, 1,000,000 datasets are generated, each comprising a 100 × 10 random matrix as the evaluation data and a 1 × 10 random vector as the corresponding weights. Each dataset represents the evaluations of 100 objects based on 10 evaluation indicators, with a data scale consistent with the order of magnitude of the number of evaluation objects and indicators commonly used in practical evaluations. The columns of the random matrix are composed of: two sets of low-discrimination indicators following the distributions N(100, 1) and N(100, 10), two sets of high-discrimination indicators following the distributions N(100, 90) and N(100, 99), and six sets of medium-discrimination indicators following the distribution N(100, 50).

The evaluation objects are sorted by their evaluation results and evenly divided into five categories: Grade A (top 20%), Grade B (20 to 40%), Grade C (40 to 60%), Grade D (60 to 80%), and the remaining Grade E. For each dataset, the original evaluation method is first applied, followed by weighting the evaluation indicators based on their discriminability priority quantification values. The kappa value κ is then calculated between the classification results of the weighted method and the original method. If κ ≥ 0.95, the improved evaluation method is deemed effective, and the corresponding η value is recorded. For the 1,000,000 datasets, the probability distribution of η is derived, and the threshold value α ≈ 0.25 is calculated such that P(η > α) = 95%. The η distribution of the experiment is shown in Figure 2.
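The grading and agreement steps of this consistency test can be sketched as follows. The code assigns five equal-sized grades by rank and computes the unweighted Cohen's kappa from its standard definition. The function names, the tie handling (stable sort order), and the ten hypothetical scores are assumptions for illustration, and the article's own experiment uses 100 objects per dataset.

```python
from collections import Counter


def assign_grades(scores, grade_labels="ABCDE"):
    """Rank objects from best to worst and split them evenly into grades."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    grades = [None] * n
    for rank, i in enumerate(order):
        grades[i] = grade_labels[rank * len(grade_labels) // n]
    return grades


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two label sequences of equal length."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    if expected == 1.0:
        return 1.0  # degenerate case: both raters use one identical label
    return (observed - expected) / (1 - expected)


# Hypothetical results of an original and an improved evaluation method.
original = [0.95, 0.90, 0.80, 0.75, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]
improved = [0.97, 0.81, 0.88, 0.75, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]
kappa = cohen_kappa(assign_grades(original), assign_grades(improved))
print(round(kappa, 4))
```

Here two of the ten objects swap grades, so the observed agreement is 0.8, the chance agreement is 0.2, and kappa is 0.75, which would fall below the 0.95 acceptance level used in the experiment.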

Figure 2

η distribution of κ ≥ 0.95.

3 Process of nonlinear comprehensive evaluation method based on information entropy and discrimination optimization

The proposed method is essentially an improvement of known evaluation methods given the evaluation dataset and indicator weights. By eliminating qualification and invalid indicators from the evaluation indicators and weighting only the weighted indicators, it ensures that the evaluation has good comprehensive evaluation discriminability.

In specific applications of comprehensive evaluation, a large number of qualification and invalid indicators can affect the comprehensiveness of the evaluation. In such cases, the first step is to check the scientific nature of the quantification of the datasets corresponding to the qualification and invalid indicators. The second step is to introduce a decrease rate coefficient λ for dynamic adjustment of the parameters α and β (λ = 0.8 is set to ensure the continuity and stability of the threshold variation). Specifically, if the cumulative weight of invalid indicators exceeds 0.2, update the threshold α ← λα to increase the number of weighted indicators, and update the threshold β ← λβ to reduce the number of invalid indicators, thereby meeting the requirements for the comprehensiveness of the evaluation. The pseudocode of the proposed method is illustrated in Figure 3.
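The threshold adjustment can be sketched as a small loop around any classification routine. The text does not say whether the update is applied once or repeatedly, so repeating it until the cumulative weight of invalid indicators is at most 0.2 is an assumption, as are the function names, the `max_rounds` safeguard, and the toy classifier with fixed η values (in the actual method η depends on the current comprehensive evaluation vector).

```python
def relax_thresholds(classify, weights, alpha=0.25, beta=None, lam=0.8,
                     max_invalid_weight=0.2, max_rounds=50):
    """Shrink alpha and beta by lam while invalid indicators carry too much weight.

    classify(alpha, beta) must return a list of labels, one per indicator.
    """
    if beta is None:
        beta = 1.0 / (2 * len(weights))
    labels = classify(alpha, beta)
    for _ in range(max_rounds):  # safeguard against endless shrinking (assumed)
        invalid_weight = sum(w for w, lab in zip(weights, labels) if lab == "invalid")
        if invalid_weight <= max_invalid_weight:
            break
        alpha *= lam  # admits more weighted indicators
        beta *= lam   # turns fewer indicators into invalid ones
        labels = classify(alpha, beta)
    return alpha, beta, labels


# Toy setup: ten indicators with hypothetical weights and fixed eta values.
weights = [0.30, 0.25, 0.22, 0.04, 0.04, 0.03, 0.03, 0.03, 0.03, 0.03]
etas = [1.0, 0.6, 0.5, 0.22, 0.22, 0.1, 0.1, 0.1, 0.1, 0.1]


def toy_classify(alpha, beta):
    """Stand-in classifier that applies the alpha and beta rules to fixed etas."""
    labels = []
    for w, eta in zip(weights, etas):
        if eta >= alpha:
            labels.append("weighted")
        elif w >= beta:
            labels.append("qualification")
        else:
            labels.append("invalid")
    return labels


alpha, beta, labels = relax_thresholds(toy_classify, weights)
print(alpha, beta, labels.count("invalid"))
```

In this toy run the initial thresholds leave invalid indicators with a cumulative weight of 0.23, so one update is applied; afterwards the two indicators with η = 0.22 are weighted and the remaining invalid weight is 0.15.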

Figure 3

Pseudocode of the proposed method.

4 Case study

In this section, the proposed method is applied to comprehensive evaluations using both material selection evaluation data and synthetic evaluation data. The results are compared with those from conventional methods to verify whether the proposed method meets the expected standards in terms of discriminability and effectiveness of the evaluation.

4.1 Die-casting magnesium alloys selection evaluation

The material selection process involves numerous factors, necessitating a comprehensive evaluation of materials based on various performance aspects. Taking the selection of magnesium alloy materials as an example, this section evaluates five widely used magnesium alloys – AZ91, AM60, AM20, AE42, and AS41 – from the AZ (Mg–Al–Zn), AM (Mg–Al–Mn), AS (Mg–Al–Si), and AE (Mg–Al–Re) series, based on 21 indicators across three aspects: service performance, technological performance, and eco-economic performance. The weights of the indicators are shown in Table 1, and the specific indicator values can be found in ref. [4].

Table 1

Evaluation indicators and weights for die-casting magnesium alloy selection

No.	Primary indicator	Secondary indicator	Weight
1	Service performance	Oxidation activity	0.1531
2	Service performance	Corrosion resistance	0.1435
3	Service performance	Impact toughness (J)	0.0653
4	Service performance	Strength (MPa)	0.0799
5	Service performance	Stiffness (GPa)	0.0480
6	Service performance	Elongation	0.0480
7	Service performance	Polishability	0.0292
8	Service performance	Damping capability	0.0118
9	Service performance	Thermal conductivity	0.0036
10	Service performance	Electrical conductivity	0.0545
11	Technological performance	Resistance to cold defects	0.0401
12	Technological performance	Air-tightness	0.0453
13	Technological performance	Hot cracking sensitivity	0.0212
14	Technological performance	Flowability	0.0119
15	Technological performance	Non-adhesion to molds	0.0119
16	Technological performance	Machinability	0.0071
17	Technological performance	Surface treatment capability	0.0918
18	Technological performance	Plating performance	0.0290
19	Eco-economic performance	Environmental friendliness	0.0327
20	Eco-economic performance	Recyclability and reusability	0.0394
21	Eco-economic performance	Economic viability	0.0327

In the original evaluation, there were 21 secondary indicators. With the proposed method, however, only 14 weighted indicators are actually involved in the evaluation: indicators 2, 12, 17, 19, and 20 are qualification indicators, while indicators 8 and 15 are invalid indicators. Table 2 presents the comparative results of the comprehensive evaluation of the five die-casting magnesium alloys. In addition to the discriminability defined by Eq. (3), the discriminability index defined by Eq. (1) (with a group percentage of 33%) and the discriminability function defined by Eq. (2) were introduced for comparative analysis. The improvement percentages of the three types of discriminability definitions are 210, 55, and 87%, respectively.

Table 2

Comparative result of the comprehensive evaluation of die-casting magnesium alloys

Magnesium alloy	AZ91	AM60	AM20	AE42	AS41
Proposed method	0.2527	0.2167	0.1642	0.2161	0.1503
Original method	0.2336	0.2165	0.1763	0.2102	0.1633
Comprehensive ranking	1	2	4	3	5
Qualification indicators (Grade ≥ 4)	Meets	Meets	Meets	Meets	Meets

As shown in Table 2 and Figure 4, the comprehensive evaluation ranking results obtained using the proposed method are consistent with those of the original evaluation. However, the discriminability of the evaluation results is significantly enhanced. Consequently, the proposed method can accurately identify invalid indicators in comprehensive evaluations. If no qualification or invalid indicators are identified during the evaluation process, the proposed method will not alter the evaluation results. This also indicates that the original method already possesses good comprehensive evaluation discriminability.
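The ranking consistency and the direction of the discriminability change can be re-derived from the scores printed in Table 2. The sketch below applies Eq. (3) to both rows and compares the rankings. The function names are illustrative assumptions, and because the published scores are rounded to four decimals and the resulting discriminability values are small, percentages computed this way need not reproduce the reported improvement figures.

```python
import math


def discriminability(values):
    """Eq. (3) after normalizing the values to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(values))


def ranks(scores):
    """Rank 1 for the highest score; ties keep their input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


alloys = ["AZ91", "AM60", "AM20", "AE42", "AS41"]
proposed = [0.2527, 0.2167, 0.1642, 0.2161, 0.1503]  # Table 2, proposed method
original = [0.2336, 0.2165, 0.1763, 0.2102, 0.1633]  # Table 2, original method

d_proposed = discriminability(proposed)
d_original = discriminability(original)
same_ranking = ranks(proposed) == ranks(original)
print(dict(zip(alloys, ranks(proposed))), same_ranking, d_proposed > d_original)
```

Both rows give the ranking listed in Table 2, and the entropy-based discriminability of the proposed scores is larger than that of the original scores.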

Figure 4

Comparison of the comprehensive evaluation of die-casting magnesium alloys.

4.2 Synthetic data comprehensive evaluation

In accordance with the experimental design for analyzing the rationality of the initial threshold value α = 0.25 presented in Section 2.3, 1,000,000 sets of randomly generated data are used as simulated evaluation data for further experiments on discriminability enhancement and evaluation validity.

The median values of discriminability enhancement for the proposed discriminability, the discriminability from Reference [8], the 27% discriminability index, and the 33% discriminability index are 26.2, 11.5, 9.6, and 9.6%, respectively. The results of the discriminability enhancement distributions are presented in Figure 5.

Figure 5

Discriminability improvement of synthetic data evaluation. (a) Proposed discriminability improvement (%). (b) Ref. [8] discriminability improvement (%). (c) Discriminability index (27%) improvement. (d) Discriminability index (33%) improvement.

The validity of the proposed method is demonstrated through a Cohen’s kappa consistency test comparing the evaluation results of the proposed method and the original method on synthetic data. Figure 6 shows that among all the test data, 60.8% have κ = 1, 94.48% have κ ≥ 0.95, and 99.9% have κ ≥ 0.8.

Figure 6

Cumulative distribution function of synthetic data Cohen’s kappa consistency test.

5 Conclusion

This article first defines a discriminability quantification method based on information entropy, which has a clear physical meaning and does not require sorting of evaluation results during the calculation process, making it easier to derive related conclusions theoretically. Moreover, the indicator discriminability can reflect the scientific nature of the evaluation indicator quantification to a certain extent. Subsequently, it is proven that the comprehensive evaluation discriminability decreases as low-discriminability evaluation indicators are added, and the lower bound estimation of the rate of decrease in comprehensive evaluation discriminability is provided, laying the theoretical foundation for the method presented in this article. Qualitative analysis and numerical experiments are conducted to establish the classification criteria for weighted indicators, qualification indicators, and invalid indicators. Based on this, a nonlinear comprehensive evaluation method based on information entropy and discrimination optimization is proposed, which involves weighting only the weighted indicators to calculate the evaluation results. Finally, this method is applied to case studies of comprehensive evaluation using die-casting magnesium alloys performance data and randomly generated data. Through qualitative analysis and practical application, it is evident that the proposed method has a definite comprehensive evaluation discriminability and reliable effectiveness. In numerical experiments, the deviation in the high-scoring partition significantly increases, effectively reducing the chances of repeated scores in comprehensive evaluations, making it particularly suitable for selective comprehensive evaluations. In future research, the indicator classification method and the selection of method parameters can be further optimized using machine learning algorithms [20].

Acknowledgments

The authors acknowledge the Ministry of Education Humanities and Social Sciences Project (Grant: 23YJA880061), the First Batch of Teaching Reform Projects in the 14th Five-Year Plan of Higher Vocational Education in Zhejiang Province (Grant: jg20230410), and the Second Batch of Teaching Reform Projects in the 14th Five-Year Plan of Higher Vocational Education in Zhejiang Province (Grant: jg20240410).

Funding information: The authors state no funding involved.
Author contributions: Guanjun Xu: funding acquisition, conceptualization, investigation, methodology, writing – original draft. Xijun Zeng: data curation, formal analysis, validation, writing – review and editing. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Conflict of interest: The authors state no conflict of interest.
Data availability statement: The data sets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

[1] Riedel SL, Pitz GF. Utilization-oriented evaluation of decision support systems. IEEE Trans Man Cybern. 1986;16(6):980–96. doi:10.1109/TSMC.1986.4309016

[2] Du D, Pang QH. Modern comprehensive evaluation method and case selection. Beijing: Tsinghua University Press; 2021.

[3] Sun JS, Wang X, Zeng DZ, Yang J, Li ZL, Luo JC, et al. A method for evaluating the applicability of nickel-based alloy tubing/casing in high-temperature, high-pressure, and high-sulfur gas wells. Nat Gas Ind J. 2024;44(11):1–10.

[4] Zhao JH, Duan YL, Chu WH. Improve analysis hierarchy process and fuzzy synthetic judgment on the selection of die casting magnesium alloys. Die Cast Res. 2008;29(6):735–8.

[5] Kaliszewski I, Podkopaev D. Simple additive weighting - A metamodel for multiple criteria decision analysis methods. Expert Syst Appl. 2016;54:155–61. doi:10.1016/j.eswa.2016.01.042

[6] Butler JD, Morrice DJ, Mullarkey PW. A multiple attribute utility theory approach to ranking and selection. Manag Sci. 2001;47(6):800–16. doi:10.1287/mnsc.47.6.800.9812

[7] Peng ZL, Zhang Q, Yang SL. Overview of comprehensive evaluation theory and methodology. Chin J Manag Sci. 2015;23(SI):245–56.

[8] Li XQ, Gao XH. Research on the optimal dimensionless model in comprehensive evaluation methods. Stat Decis. 2024;40(5):44–9.

[9] Li H, Zhu JP. Research progress on comprehensive evaluation methods. Stat Decis. 2012;9:7–11.

[10] Saaty TL. Axiomatic foundation of the analytic hierarchy process. Manag Sci. 1986;32(7):841–55. doi:10.1287/mnsc.32.7.841

[11] Zhang MF. Combination evaluation methods and applications. Beijing: Science Press; 2018.

[12] Steiger NM, Wilson JR. An improved batch means procedure for simulation output analysis. Manag Sci. 2002;48(12):1569–86. doi:10.1287/mnsc.48.12.1569.438

[13] Weick KE. Theory construction as disciplined imagination. Acad Manag Rev. 1989;14(4):516–31. doi:10.2307/258556

[14] Cureton EE. The upper and lower twenty-seven percent rule. Psychometrika. 1957;22:293–6. doi:10.1007/BF02289130

[15] Guo YJ, Ma FM, Dong QX. Analysis of the influence of dimensionless methods on deviation maximization method. J Manag Sci China. 2011;14(5):19–28.

[16] Zhang R, Liu SF, Liu B. A general algorithm for objective weighting based on deviation maximization. Stat Decis. 2007;24:29–31.

[17] Li ZR, Ma XJ, Peng ZL. Research on combination evaluation methods based on deviation maximization. Chin J Manag Sci. 2013;21(1):174–9.

[18] Li XQ, Gao XH. Research on discriminability and stability of comprehensive evaluation results. Stat Decis. 2022;38(16):16–21.

[19] Li YN. Research on the distinguish degree and weight designing of evaluation system based on entropy theory [dissertation]. Nanjing: Nanjing University of Aeronautics and Astronautics; 2008.

[20] Luo WY, Li M, Cai JJ. Research on multi-period development evaluation model for online learning based on XGBoost algorithm. Educ Inf Technol. 2024;24(6):49–54.

Received: 2025-02-10

Revised: 2025-05-12

Accepted: 2025-05-19

Published Online: 2025-06-20

This work is licensed under the Creative Commons Attribution 4.0 International License.


https://doi.org/10.1515/nleng-2025-0154

Keywords for this article

discriminability; comprehensive evaluation method; indicator classification
