Article, Open Access

Query optimization-oriented lateral expansion method of distributed geological borehole database

  • Qingjia Luo
Published/Copyright: December 8, 2023

Abstract

To reduce resource occupancy and improve the retrieval efficiency of geological drilling databases, this study proposes a query optimization-oriented horizontal expansion method for distributed geological drilling databases. A comprehensive geological data subtree is constructed, the characteristics of distributed databases and of the elements in geological databases are analyzed, and data resources are retrieved quickly on the basis of element attributes. In addition, a multi-constraint model is used to extend the borehole database horizontally and thereby optimize the expansion of the distributed geological drilling database. Experiments are conducted to verify the performance and applicability of the proposed method. They show that when the geological data capacity is 80 GB, the capacity of the geological database can be extended to $41 \times 10^{5}$ TB with the proposed method. After horizontal expansion of the database, the retrieval efficiency is higher than 89% and the resource occupancy rate is lower than 12%. The proposed method therefore makes horizontal expansion of the geological drilling database more effective, reducing its resource occupancy rate and improving its retrieval efficiency. This is of value for improving the efficiency of geological drilling work and for its further development.

1 Introduction

Geological drilling data are important for national construction. Every year, tens of thousands of newly drilled holes are used in geological engineering evaluations and geological surveys. Preserving and sharing drilling data allows geological data to be fully used and their value to be tapped, which is urgently needed to support economic growth [1]. In the era of rapidly developing big data, application research on drilling big data has been incorporated into planning at the level of the national data strategy. Drilling data are large in volume and grow quickly. Existing management methods do not share data, which greatly reduces the value that drilling data can deliver. Moreover, existing geological database systems only organize the complex and ever-changing geological drilling data into a standardized and consistent engineering geological information database. Because the information is complex, the resource occupancy of the geological drilling database is high, which leads to system stagnation, slow operation, and low retrieval efficiency. Therefore, studying drilling data organization and query optimization methods and establishing a high-availability drilling data service platform promotes faster data acquisition and dissemination and has important theoretical and practical significance.

Geological databases have been successfully developed abroad. Economically developed countries, including the United States, Canada, France, Germany, Australia, and Russia, have established dedicated geological databases. Some of these support online retrieval of geological and mineral data, along with commercial services [2]. Since the United States Federal Geological Survey began to digitize geological and mineral data and to build geological and mineral resource databases in 1980, it has invested heavily in a digital engineering system for mineral and energy resource data [3,4,5]. After years of effort, important results have been achieved. Many important geological and mineral resource databases have been established, including United States mineral resources databases such as the National Coal Database, the National Hydrological Data Storage and Retrieval System, the Marine Mineral Resources Database, and the Geochemical and Rock Analysis Database. These databases have greatly helped the digitization of mineral data in the United States, enabled the sharing of geological and mineral data, and provided decision support for geological and mineral strategic planning in the United States [2,6,7].

The construction of China’s geological and mineral data systems began in the last century and has since accumulated some experience. Early on, the geology and mining sector planned a national geology and mineral resources information system. According to the plan, this system includes a geoscientific information subsystem, multiple national databases of geology and mineral resources, and a geological data model library. So far, comprehensive and foundational geological databases numbering in the hundreds have been established. These include geological exploration databases, mineral resources databases, and other geological databases. In addition, the construction of a special geological science database began at the start of this century. The geological databases built in China cover a complete range of specialties, including geology, physics, chemistry, and remote sensing, and can accommodate the storage and management of massive geological data. Moreover, standardized operation points the way forward for the construction of China’s geological and mineral information system, namely development with its own characteristics [8,9].

In terms of geological and mineral data optimization, Luo and Zhang [10] proposed a frequent item data query method based on an extreme value perturbation optimization privacy framework. To query the target frequent item data accurately, this method uses a privacy framework to query the hidden frequent items in uncertain data. A perturbation operator is designed according to the time-varying nonlinearity of the frequent item data, the data are selected uniformly in time, and the privacy framework is integrated to extract the hidden frequent item data from the target data accurately. Peters et al. [6] studied Macrostrat, a relational geospatial database and supporting network infrastructure. Macrostrat has produced a large number of quantitative results, and its infrastructure serves as a data platform for multiple independently developed mobile applications. It has therefore expanded its geographic coverage and refined its age models and material properties to characterize the upper crust around the globe more accurately. Ping [11] proposed a conceptual model of plate tectonics to reconstruct geological databases. In this model, a space of borehole geological data is constructed from the mapped features, so that abstract structural features can be mapped to different geological objects and the corresponding data can be retrieved.

Although the digitization of China’s mineral resources data has achieved certain results and accumulated considerable experience, a large gap remains compared with the level of digital construction abroad, mainly in the following respects. First, scale and coverage: abroad, a specialist team has generally been set up for each mine geological database to manage geological and mineral resource data, whereas in China this work is in its infancy, and some important mineral resources still have no geological database. Second, some existing geological databases are not highly specialized: they merely store geological and mineral data electronically and do not provide an application-oriented database system tailored to geological data [12,13,14]. Third, the Internet is not fully utilized. Many geological and mineral databases are produced and used only by their owners, and important mineral data are not shared. Existing data must be collected repeatedly, which wastes resources and manpower [15]. Digital engineering of geological and mineral resource information is therefore necessary: it supports the scientific management of geology and mineral resources in China, contributes to long-term strategic planning for sustainable development, provides decision support for government, greatly improves the working efficiency of geological professionals, and enables the rational use of resources. Much work remains to be done on the digital engineering of geological and mineral data in China, and more professional geological workers need to be involved in it.

Therefore, this study proposes a query optimization-oriented lateral expansion method for distributed geological borehole databases [16]. A comprehensive geological data subtree is constructed, and the characteristics of the distributed database and of the elements in the geological database are analyzed. On this basis, a multi-constraint model is constructed, and expansion optimization of the distributed geological drilling database is realized through query optimization [17].

2 Comprehensive geological data subtree construction

Distributed databases. Several physically dispersed centralized databases are connected through a high-speed interconnection network to form a database cluster, which provides a logically unified database service to the outside through the cooperation of the nodes in the cluster. The main idea of a distributed database is to distribute the data stored in a single database across multiple nodes in the cluster to increase storage capacity and the number of concurrent accesses [18,19]. Distributed databases mainly have the following characteristics:

  1. Scalability. Distributed databases can scale to clusters of hundreds or thousands of nodes, and as the cluster grows, the overall performance of the system increases linearly.

  2. Low cost. The fault-tolerance and load-balancing mechanisms of distributed databases make it possible to build them on ordinary PCs, and good support for linear expansion makes cluster expansion convenient, which greatly reduces operation and maintenance costs.

  3. High performance. Both a single server in the cluster and the cluster as a whole offer high performance, respond to large-scale read and write requests in time, and support random reads and writes of massive data.

  4. High availability. A distributed database provides a good fault-tolerance mechanism and guarantees the high availability of data and services through distributed storage and redundant backup of data in the cluster.

2.1 Geological data element

In comprehensive geological data, the data model elements are defined as geological data elements. When the data of geological and mineral resources are analyzed, the geological data elements can be abstracted into five basic types according to the line classification method, namely, geological map, geological table, geological document, geological report, and classification. Among them, geological maps, geological documents, and geological tables are entities, which contain both entity data and attribute data, while geological reports and classifications are abstract concepts, which contain only attribute data and no entity data. These five data elements are the basic data units of geological and mineral data [15]. To describe them formally, each data element is divided into three components: type identifier, attribute data, and entity data. Triples are mainly used to store sparse matrices as a compression method, which reduces the memory occupied and shortens access and operation time. The five data elements are therefore defined in the form of a triple, as follows:

(1) $\text{DZdataElement} = \langle \text{Type}, \text{MetaData}, \text{EntityData} \rangle,$

where Type is a type identifier that determines the type of the geological data element and corresponds to one of the five basic data units. A given geological data item is of exactly one type: geological map, geological document, geological table, geological report, or classification [20].

MetaData is the attribute data of the geological data element and describes its attribute information. EntityData is the entity data of the geological data element. Only geological maps, geological tables, and geological documents can contain entity data, so EntityData has a value only for these types. Geological and mineral data, which are varied and voluminous, can be classified and described with these five basic geological data elements. They are the model elements of the integrated geological data model and can be used to describe the static structure and dynamic operation of the model [21].
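The triple in equation (1) can be sketched as a small data type. This is only an illustration under stated assumptions: the class and field names, the use of a dictionary for attribute data, and the use of bytes for entity data are hypothetical choices, not part of the original model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ElementType(Enum):
    """The five basic geological data element types named in the text."""
    GEOLOGICAL_MAP = "geological map"
    GEOLOGICAL_DOCUMENT = "geological document"
    GEOLOGICAL_TABLE = "geological table"
    GEOLOGICAL_REPORT = "geological report"
    CLASSIFICATION = "classification"


# Entity types: only these may carry entity data.
ENTITY_TYPES = {
    ElementType.GEOLOGICAL_MAP,
    ElementType.GEOLOGICAL_DOCUMENT,
    ElementType.GEOLOGICAL_TABLE,
}


@dataclass
class DZDataElement:
    """Triple <Type, MetaData, EntityData> in the sense of equation (1)."""
    type: ElementType
    meta_data: dict                       # attribute data (assumed key/value form)
    entity_data: Optional[bytes] = None   # assumed payload representation

    def __post_init__(self):
        # Reports and classifications are abstract: attribute data only.
        if self.type not in ENTITY_TYPES and self.entity_data is not None:
            raise ValueError(f"{self.type.value} cannot hold entity data")
```

Rejecting entity data on reports and classifications at construction time keeps the distinction between entities and abstract concepts explicit in the type itself.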

2.2 Integrated geological data tree

Mines and minefields generate a broad range of geological data that play a key role in the prospecting and extraction of mineral deposits, as well as in management decision-making. This study proposes a comprehensive geological data model (20) that organizes geological data in an orderly manner, which facilitates their systematic management and thus improves the efficiency of geological data use.

Once the comprehensive geological data elements are defined, the static structure of the comprehensive geological data model can be described with an integrated geological data tree. The comprehensive geological data tree is a tree data structure whose nodes are the five basic geological data elements; it details the characteristics of geological and mineral resource data and the mutual constraint relationships between them. Unlike an ordinary tree, a comprehensive geological data tree is a modified tree structure: its geological data elements are related not only as parent and child nodes but are also subject to the following three constraint relationships [14]:

  1. Geological maps, geological tables, and geological documents are the smallest units of entity data among the comprehensive geological data elements, and no other type of element can be derived from them. If a geological data element is a geological map, a geological table, or a geological document, it cannot have successors.

  2. Geological reports and classifications are two conceptual entities and can have successors and precursors. If they have a precursor, it can only be a classification, not any other geological data element.

  3. If a geological report data element has successors, they can only be of three types (geological map, geological document, and geological table), not classification. The successors of a classification data element can be any of the five geological data elements.
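The three constraint relations can be checked mechanically. The following is a minimal sketch, assuming a tree is written as nested `(type, children)` tuples with the short type names shown; this representation and the function name are illustrative choices only.

```python
# Allowed successor (child) types for each element type, following the
# three constraint relations listed above.
ENTITY = {"map", "table", "document"}
ALLOWED_CHILDREN = {
    "map": set(),            # constraint 1: entities have no successors
    "table": set(),
    "document": set(),
    "report": set(ENTITY),   # constraint 3: report -> entities only
    "classification": ENTITY | {"report", "classification"},
}


def validate_tree(node):
    """Return a list of constraint violations in a nested (type, children) tree."""
    node_type, children = node
    errors = []
    for child in children:
        child_type = child[0]
        if child_type not in ALLOWED_CHILDREN[node_type]:
            errors.append(f"{node_type} cannot have successor {child_type}")
        errors.extend(validate_tree(child))
    return errors


# A classification root with a report and a nested classification below it.
example = ("classification", [
    ("report", [("map", []), ("table", [])]),
    ("classification", [("document", [])]),
])
print(validate_tree(example))
```

Constraint 2 is enforced implicitly: reports and classifications appear as allowed successors only under a classification, so their precursor can be nothing else.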

Figure 1 depicts the geological data tree before and after extension. Figure 1(a) shows an example of a geological data tree, while Figure 1(b) shows an example of the extended geological data tree.

Figure 1: Examples of a geological data tree before and after expansion. (a) Example of a geological data tree and (b) example of an extended geological data tree.

With the above three constraint relations, the formal definition of a comprehensive geological data tree can be given as

(2) $\text{DZdataTree} = (D, R),$

where $D$ is the set of geological data elements and $R$ is the set of relations on $D$, specified as follows:

  1. If $D$ is empty, meaning that there are no data, the comprehensive geological data tree is an empty geological data tree;

  2. If $D$ has only one geological data element, no relation is defined and $R$ is empty;

  3. If $D$ contains two or more geological data elements, $R = \{H\}$, where $H$ is a binary relation that satisfies the following three conditions of the comprehensive geological data tree:

$D$ has one and only one geological data element, root, as the root node of the comprehensive geological data tree (this element has no precursor). If root is a geological map, a geological table, or a geological document, it has no successor. If it is a geological report element, its successors can only be geological maps, geological documents, or geological tables.

If $D - \{\text{root}\} \neq \varnothing$, there is a partition $D_1, D_2, \ldots, D_m$ ($m > 0$) of $D - \{\text{root}\}$ such that $D_j \cap D_k = \varnothing$ for any $j \neq k$ ($1 \le j, k \le m$), and for any $i$ ($1 \le i \le m$) there is a unique geological data element $x_i \in D_i$ with $\langle \text{root}, x_i \rangle \in H$, where $m$ is the number of tuples accessed, $k$ is the number of nodes in the connection tree, and $j$ is the node attribute coefficient.

Corresponding to this partition of $D - \{\text{root}\}$, the relation $H - \{\langle \text{root}, x_1 \rangle, \ldots, \langle \text{root}, x_m \rangle\}$ has a unique partition $H_1, H_2, \ldots, H_m$ ($m > 0$) such that $H_j \cap H_k = \varnothing$ for any $j \neq k$ ($1 \le j, k \le m$), and for any $i$ ($1 \le i \le m$), $H_i$ is a binary relation on $D_i$ and $(D_i, \{H_i\})$ is a comprehensive geological data tree in accordance with this definition, called a comprehensive geological data subtree of root. This completes the construction of the distributed geological drilling data subtree. Based on this database storage tree structure, the horizontal expansion method of the database is designed.

3 Lateral expansion method of the distributed geological drilling database

According to the characteristics of the distributed geological drilling database, the storage tree structure model of the database is constructed. On the basis of this model, the horizontal expansion and optimization of the database are carried out [22].

3.1 Principle of the lateral expansion method

Geological information from different fields and of different kinds is integrated to form a complete geological drilling database, which provides valuable reference data for various industries. Database expansion is the ability to increase the carrying capacity of the system continuously. Horizontal expansion of the database places different data on different nodes, according to variables such as the data volume and the geological data set, by means of read-write separation, vertical slicing, and horizontal slicing. This work calculates a spatial location parameter and, from it, the efficiency of horizontal expansion.
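Horizontal and vertical slicing can be illustrated with a short sketch. It assumes hash-based placement of rows and a fixed grouping of columns; the text does not specify either, and the function names, the `id` key, and the record fields are hypothetical.

```python
import hashlib


def horizontal_shard(record_id: str, node_count: int) -> int:
    """Horizontal slicing: choose a node for a whole row by hashing its key."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def vertical_split(record: dict, column_groups: dict) -> dict:
    """Vertical slicing: split one row into groups of columns.

    column_groups maps a group name to the columns stored together.
    The key column "id" is copied into every group so rows can be rejoined.
    """
    return {
        group: {"id": record["id"], **{c: record[c] for c in columns if c in record}}
        for group, columns in column_groups.items()
    }


# Hypothetical borehole record, for illustration only.
record = {"id": "BH-0001", "x": 120.3, "y": 30.7, "depth_m": 108.5, "lithology": "clay"}
node = horizontal_shard(record["id"], node_count=4)
parts = vertical_split(record, {"location": ["x", "y"], "log": ["depth_m", "lithology"]})
print(node, parts)
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash`, which is randomized per process for strings and would place the same key on different nodes across runs.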

The calculation process of horizontal expansion of the geological database is as follows:

Let the data quantity of the geological drilling database be $w$ and the number of data attributes be $g$. The geological drilling data set is $\{e_1, e_2, \ldots, e_w\}$, where $e_i$ is the $i$th data item in the geological drilling database. The set of all geological drilling data attributes is $\{h_1, h_2, \ldots, h_g\}$, where $h_j$ is the $j$th data attribute in the geological drilling database, and the updating speed of the geological drilling data is $\eta$. Data nodes are selected probabilistically according to their load weights, which yields the optimal spatial location. The spatial location parameter of the extended geological borehole data can be calculated using the following formula:

(3) $\mu = \dfrac{(e_i h_j)^{2}}{j i^{2} + \eta} \times w.$

The lateral expansion efficiency of the geological drilling database can be calculated by using the following formula:

(4) $w = \dfrac{(j i^{2} + \eta)\, \mu}{e_i^{2} h_j^{2}}.$

According to the method described above, the horizontal expansion of the geological drilling database can be carried out to provide data support for different industries.
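The probabilistic selection of data nodes by load weight mentioned above can be sketched as follows. It assumes that a larger weight means a node should be chosen more often (for example, a node with more free capacity); the text does not define the weights, and the function and node names are hypothetical.

```python
import random


def choose_node(load_weights: dict, rng: random.Random) -> str:
    """Pick a data node with probability proportional to its load weight."""
    nodes = list(load_weights)
    weights = [load_weights[n] for n in nodes]
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("load weights must be non-negative and not all zero")
    return rng.choices(nodes, weights=weights, k=1)[0]


# Hypothetical weights: node-b is preferred three times as often as node-a.
rng = random.Random(42)
placements = [choose_node({"node-a": 1.0, "node-b": 3.0}, rng) for _ in range(1000)]
print(placements.count("node-a"), placements.count("node-b"))
```

Passing the random generator in explicitly keeps the placement reproducible in tests while leaving the choice of seed to the caller.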

3.2 Horizontal expansion optimization method of the distributed geological database

When a traditional algorithm is used to optimize the horizontal expansion of the geological borehole database, large differences in data attributes caused by the large amount of data cannot be avoided, which reduces database query efficiency [23]. Therefore, a query optimization-oriented lateral expansion method of the distributed geological borehole database is proposed.

Building the multi-constraint model. In building the multi-constraint model, the diversity of borehole data attributes within the same geology is fully considered so that unnecessary, cumbersome geological borehole data characteristics can be reduced. The steps are as follows. Pearson’s correlation coefficient is used in this study to measure the similarity of the data; it is calculated by dividing the covariance of two variables by the product of their standard deviations. The similarity coefficient of the data in the distributed geological borehole database is calculated using the following formula:

(5) $\operatorname{sum}(w_b, w_c) = \eta\, \dfrac{\sum_{i=1}^{m} w_{bc}}{\sum_{i=1}^{m} w_{bc}^{2}} \times \sum_{j=0}^{m} h_g.$
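For reference, the textbook Pearson correlation coefficient described in the text (covariance divided by the product of the standard deviations) can be sketched as below. This follows the standard definition, not equation (5) term by term, and the function name is illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # The common factor 1/n cancels between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Values close to 1 or −1 indicate strongly similar (or strongly opposed) attribute vectors, which is the property the text uses to identify repeated data.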

Based on the method described above, the degree of similarity among redundant and repeated data in the geological borehole database can be determined, and strongly similar data in the database can be described with a feature vector. The following formula can be used to calculate the variation parameters of the attribute characteristics of borehole data in different geology:

(6) $e(y, z) = \sum_{j=1}^{m} (y_j - z_j)^{2}.$

If the value of $s$ is changed, the difference in data attribute characteristics is strong. Assuming $s = 1$, where $s$ is taken to be the exponent in equation (6), the distributed geological borehole data attribute characteristic value obtained by the above formula can be a negative number. Therefore, the following formula is used for calculation:

(7) $e(y, z) = \sum_{j=1}^{m} |y_j - z_j|,$

where $y$ and $z$ represent the characteristics of different data in the geological borehole database, and the distance between them indicates their similarity:

(8) $e(y, z) = \left[ (y - z)^{\mathsf{T}} B\, (y - z) \right]^{1/2},$

where $B$ is a non-negative definite matrix of the data in the distributed geological borehole database.

Assuming $B$ is the identity matrix, the above formula can be transformed into the following form:

(9) $e(y, z) = \left[ \sum_{j=1}^{m} (y_j - z_j)^{2} \right]^{1/2}.$

Assuming $B$ is a diagonal matrix with diagonal entries $b_j$, the above formula can be transformed into the following form:

(10) $e(y, z) = \left[ \sum_{j=1}^{m} b_j (y_j - z_j)^{2} \right]^{1/2}.$
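The distances can be sketched in a few lines, assuming the usual reading of equation (7) as a sum of absolute differences and of equations (9) and (10) as the Euclidean and diagonally weighted Euclidean distances. The function names are illustrative.

```python
import math


def abs_distance(y, z):
    """Equation (7): sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(y, z))


def weighted_distance(y, z, b=None):
    """Equation (10); with b=None every weight is 1, which gives equation (9)."""
    if b is None:
        b = [1.0] * len(y)
    if any(w < 0 for w in b):
        # A diagonal matrix is non-negative definite only if no entry is negative.
        raise ValueError("diagonal weights must be non-negative")
    return math.sqrt(sum(w * (a - c) ** 2 for w, a, c in zip(b, y, z)))
```

For the vectors `[1, 2]` and `[4, 6]`, the coordinate differences are 3 and 4, so the absolute distance is 7 and the unweighted distance is 5; a weight of zero removes an attribute from the comparison entirely.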

Based on the above method, the data in the distributed geological borehole database can be processed, the data variables can be obtained, and more than 90% of the distributed geological borehole database information characteristics can be obtained [1]. The following formula can be used to establish a variety of constraint models, which can be used to optimize the lateral expansion of the distributed geological borehole database:

(11) $Y' = (Y - \bar{Y}) / T, \quad T = \sqrt{\dfrac{\sum_{j=1}^{k} (Y_j - \bar{Y})^{2}}{p - 1}}.$

Model constraints are conditions that must be met for the model to work, and they are of two types: internal constraints, which must be met for the equations to be solvable, and external constraints, which are mainly conventions adopted so that the model reflects the real situation as closely as possible. The internal constraints established according to equation (11) are that the value under the square root must not be less than 0 and that the denominator must be meaningful, i.e., $p$ is not equal to 1.
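The standardization in equation (11) and its internal constraints can be sketched as follows. The sketch assumes that $p$ defaults to the number of values, which the text does not state, and it adds a check that $T$ is non-zero so the division is defined; the function name is illustrative.

```python
import math


def standardize(values, p=None):
    """Return (Y - mean) / T with T = sqrt(sum((Y_j - mean)^2) / (p - 1))."""
    k = len(values)
    if k == 0:
        raise ValueError("no values to standardize")
    p = k if p is None else p  # assumption: p equals the sample size
    if p == 1:
        raise ValueError("p must not equal 1 (zero denominator)")
    mean = sum(values) / k
    radicand = sum((v - mean) ** 2 for v in values) / (p - 1)
    if radicand < 0:
        raise ValueError("value under the square root must not be negative")
    t = math.sqrt(radicand)
    if t == 0:
        raise ValueError("T is zero: all values are identical")
    return [(v - mean) / t for v in values]
```

For the values 2, 4, 6 the mean is 4 and $T = \sqrt{8/2} = 2$, so the standardized values are −1, 0, and 1.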

The external constraints are as follows:

  1. Geological maps, geological tables, and geological documents, as the smallest unit of entity data in the integrated geological data element, cannot be derived from other types of integrated geological data units.

  2. Geological reports and classifications are two conceptual entities with both successors and predecessors.

  3. If the geological report data element has a successor, the successor can only be the geological maps, geological documents, and geological tables, and not classifications. The successor of the classification data element can be any five geological data elements.

According to the method described above, the data in the geological borehole database can be classified, and the information in the database can be divided into several different categories according to the characteristics of the data, which provides an accurate data basis for the expansion of the geological borehole database [1]. A variety of constraint models are established to judge the data categories that meet the query conditions, and the data query is carried out according to different data categories to realize the expansion and optimization of the geological borehole database.

3.3 Expansion and the optimization process of the distributed geological borehole database

Based on the constructed subtree structure of distributed geological drilling data and the tree structure of database storage, this work designs a method for horizontal expansion of the database and for its optimization. The expansion optimization process of the distributed geological drilling database is the process of generating a query execution plan (QEP) that minimizes the objective function, namely the time required for query execution in a distributed environment. This work focuses on join operations. The query optimizer of the distributed geological borehole database consists of three parts: the search space of the geological database, the horizontal expansion model of the geological borehole database, and the query rules of the geological borehole database, as shown in Figure 2.

Figure 2: Expansion optimization of the distributed geological database.

Search space of the distributed geological borehole database: the collection of alternative execution plans for a query request, all of which produce the same result. Each plan has its own execution sequence, and the different ways in which the operations are implemented lead to different performance. Execution plans are generally abstracted into operation trees, in which the nodes are operations and the shape of the tree determines their execution order. These operation trees are generated from the query request by transformation rules and are equivalent in that they produce the same result set [3].

Horizontal expansion model of the distributed geological borehole database: the cost model of the query optimizer, which includes the cost function, data statistics, tools for estimating intermediate result sets, and so on. Query execution time is the primary benchmark for calculating execution cost.

Query rules of the distributed geological drilling database: the cost model is used to evaluate the execution plans. Within the search space, a strategy such as dynamic programming or simulated annealing filters out execution plans that cannot yield the optimal solution. Each remaining plan is checked against the cost model to forecast its response time, and after all execution plans have been considered, the best one is selected.

The centralization or decentralization of the distributed geological borehole database expansion need not be taken into account during the search space stage since the execution plan is developed in accordance with specific transformation rules that are applicable under all circumstances.
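The interplay of search space, cost model, and search strategy can be illustrated with a small join-ordering sketch. It uses dynamic programming over subsets of relations and restricts the search space to left-deep plans. As a stand-in for execution time, the cost of a plan is assumed to be the total number of rows in its intermediate results. This toy cost model, the function name, and the table statistics are assumptions for illustration, not the optimizer described in the text.

```python
from itertools import combinations


def best_join_order(cardinality, selectivity):
    """Left-deep join ordering by dynamic programming over relation subsets.

    cardinality: {relation: row count}
    selectivity: {frozenset({a, b}): join selectivity}; missing pairs count as 1.0
    Cost of a plan = total rows of all intermediate results (toy cost model).
    Returns (cost, join order, estimated result rows).
    """
    names = sorted(cardinality)
    # best[subset] = (cost, join order, estimated rows of the joined subset)
    best = {frozenset([n]): (0.0, (n,), float(cardinality[n])) for n in names}
    for size in range(2, len(names) + 1):
        for subset in map(frozenset, combinations(names, size)):
            candidates = []
            for last in subset:
                # Extend the best plan for the remaining relations by `last`.
                cost, order, rows = best[subset - {last}]
                sel = 1.0
                for other in order:
                    sel *= selectivity.get(frozenset({last, other}), 1.0)
                out_rows = rows * cardinality[last] * sel
                candidates.append((cost + out_rows, order + (last,), out_rows))
            best[subset] = min(candidates)  # keep only the cheapest plan
    return best[frozenset(names)]


# Hypothetical tables and statistics, for illustration only.
cost, order, rows = best_join_order(
    {"borehole": 1_000, "layer": 50_000, "sample": 200_000},
    {frozenset({"borehole", "layer"}): 0.001, frozenset({"layer", "sample"}): 0.00002},
)
print(order, round(cost))
```

Keeping only the cheapest plan per subset is what prunes the search space: more expensive plans for the same set of relations are discarded before larger plans are built on top of them.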

4 Experiment

The data source is the geological map provided by the Zhejiang Province Survey and Mapping Bureau and the corresponding geological survey report. The map area is 120°15′–120°30′E and 30°40′–30°50′N [13]. A total of 293 slot auger coring survey points with a coring depth of 3–5 m were completed. Ten geological boreholes were completed with a total length of 1080.5 m and an average coring rate of more than 90%, with >95% in clayey soil and >85% in the loose gravel and pebble layer. Three Quaternary boreholes and four Quaternary engineering and hydrology boreholes were completed. One shallow seismic longitudinal wave survey profile with a length of 11.44 km and 563 actual physical points was completed, and two shallow seismic transverse wave survey profiles with a length of 2.01 km and 338 actual physical points were completed [14]. The data server used in the experiment was a Huawei Tecal RH5585V2, and the CPU was E7-4807 × 4. The web/document/GIS server was a Huawei Tecal RH2288HV2. The NAS storage device was an FTC SANNAS 3000-I3160S. The drilling data service platform was developed in C# on .Net Framework 4.5. The software environment included CitusDB 9.2, the distributed database extension of Postgres 12, and the spatial database extension PostGIS 2.4. The application container engine Docker 19.03.1 was used to create a Linux image on a Windows Server 2010 server, and CitusDB was used; GISServer was used to load the geographic, remote sensing, and topographic maps of the sky map. The web server was IIS 6.0. The database data and technology were free and open source.

4.1 Capacity of the distributed geological borehole database after lateral expansion

Database capacity is expanded so that the database can continue to provide external services smoothly and the availability of geological database services is ensured. To verify the effect of horizontal database expansion under different methods, the methods in previous studies [11–13] and the method in this study are used to measure the database capacity after horizontal expansion of the geological borehole database, and the results are shown in Figure 3. For ease of understanding and differentiation, the three previous methods and the method in this study are referred to as Methods A [12], B [13], C [11], and D (the database horizontal expansion method of this study), respectively.

Figure 3 shows that the lateral expansion capacity of the geological database differs between methods. As the geological data capacity increases, the lateral expansion capacity achieved by the method of this study is always much higher than that of the other methods and shows a stable upward trend [11,12,13]. If the geological data capacity is 60 GB, the maximum expansion capacity is approximately $46 \times 10^{5}$ TB. The database horizontal expansion method of this study thus consistently yields a high capacity after horizontal expansion of the geological database, and the horizontal expansion effect is good.

Figure 3: Capacity of the geological database after lateral expansion under different methods.

4.2 Retrieval efficiency of geological borehole database

Retrieval efficiency refers to the efficiency with which the database retrieves the specified data. In this study, the amount of retrieved data is varied, the retrieval time of each tested method is measured, and the efficiency value is output automatically. To verify the retrieval efficiency of the geological borehole database after lateral expansion under different methods, this study uses Ping’s method [11], Mamoru’s method [13], Guo et al.’s method [12], and the method of this study to test the retrieval efficiency after lateral expansion of the geological borehole database. The experimental results are shown in Table 1.

Table 1 shows that retrieval efficiency after horizontal expansion differs across storage capacities and methods. As the storage capacity of the geological database increases, the retrieval efficiency with the method of this study stays between 89 and 98% and exceeds 90% in five of the six cases, whereas the retrieval efficiency with the other methods [11,12,13] ranges from 39 to 64%. After horizontal expansion of the geological database, the retrieval efficiency of the method of this study is therefore always higher than that of the other methods, which is a clear advantage.

Table 1

Retrieval efficiency after lateral expansion of the geological borehole database under different methods

Retrieval efficiency after horizontal expansion (%)

Amount of data (GB)    A [12]    B [13]    C [11]    D (this study)
50                     47        54        45        98
100                    52        57        64        95
150                    57        62        52        89
200                    64        39        54        96
250                    53        43        45        95
300                    51        49        56        93
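
The ranges read off Table 1 can be checked with a few lines of arithmetic. The following is a minimal Python sketch and is not part of the original study: the values are copied from Table 1, while the names `TABLE_1` and `summarize` are illustrative assumptions.

```python
from statistics import mean

# Retrieval efficiency (%) copied from Table 1 for 50, 100, ..., 300 GB.
# The dictionary layout and the names are illustrative, not from the study.
TABLE_1 = {
    "A [12]": [47, 52, 57, 64, 53, 51],
    "B [13]": [54, 57, 62, 39, 43, 49],
    "C [11]": [45, 64, 52, 54, 45, 56],
    "D (this study)": [98, 95, 89, 96, 95, 93],
}


def summarize(values):
    """Return (minimum, maximum, mean rounded to one decimal place)."""
    return min(values), max(values), round(mean(values), 1)


for method, values in TABLE_1.items():
    low, high, avg = summarize(values)
    print(f"{method}: min={low}%, max={high}%, mean={avg}%")
```

Under these inputs, Method D spans 89 to 98%, while Methods A to C together span 39 to 64%, so the lowest value for Method D lies above the highest value of any other method.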

4.3 System resource occupancy rate after geological drilling database expansion

Resource occupancy here refers to memory usage. The lower the resource occupancy rate, the better the database expansion method performs. To verify the system resource occupancy rate after horizontal expansion of the database under different methods, the methods of Ping [11], Guo et al. [12], and Mamoru [13] and the method of this study are used to measure the occupancy rate of system resources, and the results are shown in Table 2.

Table 2 shows that the system resource occupancy rate after horizontal expansion differs between methods. As the number of iterations increases, the resource occupancy rate with the method of this study stays at or below 12%, whereas the resource occupancy rate with the other methods [11,12,13] varies from 43 to 78%. After expansion of the geological drilling database with the method of this study, the resource occupancy rate of the system is therefore significantly lower, which indicates that the lateral expansion effect of this method is good. According to analysis in SPSS 23.0, the difference is statistically significant (P < 0.05).

Table 2

System resource occupancy rate after horizontal expansion with different methods

System resource occupancy rate after horizontal expansion (%)

Iterations (times)    A [12]    B [13]    C [11]    D (this study)
50                    54        61        68        8
100                   57        64        73        6
150                   65        58        67        3
200                   68        47        43        9
250                   72        67        63        12
300                   78        77        56        5

5 Experimental results and discussion

Geological drilling data is among the most important data for national construction. In the mining industry, a wide variety of geological data is used in the exploration and exploitation of deposits and mines, as well as in management decision-making. To manage geological data in an orderly manner, and to address the high resource consumption and slow retrieval of massive drilling databases caused by the complexity and size of the data, this study proposes a database expansion method. Experiments were designed to compare the performance of three methods from the literature with the database expansion method of this work. In this comparison, the database horizontal expansion method always yields a higher capacity after horizontal expansion of the geological database, and the expansion effect is good. Its retrieval efficiency was mostly stable above 90%, while the retrieval efficiency of the geological database with the other methods was between 39 and 64%. The system resource occupancy rate decreased significantly, indicating that the horizontal expansion of the geological borehole database using this method was effective.

6 Conclusion

This study presents a lateral expansion method for a distributed geological drilling database, oriented toward query optimization. It analyzes the characteristics of a distributed database, constructs a comprehensive geological data subtree, establishes a multi-constraint model, designs the lateral expansion method of the drilling database, and realizes expansion optimization of the distributed geological drilling database through query optimization. The following conclusions are obtained through experiments: when the storage capacity of the geological database is 150 GB, the retrieval efficiency of this method is 89% after horizontal expansion of the geological database. After horizontal expansion, the system resource occupancy rate is only 5%. When the capacity of geological data is 80 GB, the capacity of the geological database can be extended to 41 × 10⁵ TB. Experiments were devised to compare the performance of the database expansion method of this study with three methods from the literature. The comparison shows that horizontal expansion of the geological database with the proposed method yields a consistently higher capacity, with a notable improvement in the horizontal expansion effect. The retrieval efficiency of the geological database remained mostly above 90%, whereas the other methods achieved a retrieval efficiency of only 39–64%. The system resource occupancy rate decreased significantly, demonstrating the effectiveness of horizontally expanding the geological borehole database with this method. However, some problems remain to be solved. The drilling data service platform supports data query and visualization, but there are few methods for online data analysis and utilization, and the types of data utilization are limited. Future research will consider extending the data product to enhance support for global queries in a weak network connectivity mode.

Acknowledgments

This study was supported by Jiangmen City in 2019 basic and applied basic research key projects (No. Jiangke [2019]256).

  1. Funding information: This research received funding from Guangdong Provincial Education Science Planning Leading Group 2023 Education Science Planning Project (Special for Moral Education), “Research on Civics Classroom Teaching Evaluation Focusing on Key Teaching Behaviors under the Perspective of Artificial Intelligence” (Project No.: 2023JKDY083); and Jiangmen Basic and Applied Research’s main project for 2022, “Research on virtual reality multi-person collaborative interaction and efficient rendering technology for intelligent manufacturing under 5G environment” (project No. JZ202216).

  2. Conflict of interest: The author declares no conflict of interest.

  3. Data availability statement: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

[1] Almayahi AZ, Ajeel ZA. Tectonic development of Nahr Umr oil field southeastern Iraq: inferences from seismic reflection and borehole data. Arab J Geosci. 2020;13(14):641. doi:10.1007/s12517-020-05663-6

[2] Jiang X, Wang M, Liu F, Zhang Y. Visualization analysis of drilling and logging data in offshore oil and gas exploration. J Coast Res. 2020;106(SI):540–3. doi:10.2112/SI106-122.1

[3] Heidari M, Nikolinakou MA, Flemings PB. Modified cam-clay model for large stress ranges and its predictions for geological and drilling processes. J Geophys Res Solid Earth. 2020;125(12):24–36. doi:10.1029/2020JB019500

[4] Fillion MH, Hadjigeorgiou J. Quantifying influence of drilling additional boreholes on quality of geological model. Can Geotech J. 2019;14(43):56–67. doi:10.1139/cgj-2017-0653

[5] Wu X, Lai X, Zhu J, Huang H, Chen L, Du S, et al. Intelligent control system design for electric-drive rig in complex geological drilling process. 2019 Chinese Control Conference (CCC); 2019. doi:10.23919/ChiCC.2019.8866523

[6] Peters SE, Husson JM, Czaplewski J. Macrostrat: a platform for geological data integration and deep-time Earth crust research. Geochem Geophys Geosyst. 2018;19(4):1393–409. doi:10.1029/2018GC007467

[7] Renaudie J, Lazarus D, Diver P. NSB (Neptune Sandbox Berlin): an expanded and improved database of marine planktonic microfossil data and deep-sea stratigraphy. Palaeontol Electron. 2020;23:1–28. doi:10.26879/1032

[8] Kunnappilly A, Marinescu R, Seceleanu C. Statistical model checking for real-time database management systems: A case study. Leveraging Applications of Formal Methods, Verification and Validation. Verification. Cham: Springer; 2019.

[9] Van Dyke M, Klemetti T, Wickline J. Geologic data collection and assessment techniques in coal mining for ground control. Int J Min Sci Technol. 2020;30(1):131–9. doi:10.1016/j.ijmst.2019.12.003

[10] Luo YM, Zhang DY. Research on frequent item data query of privacy framework optimized by extreme perturbation. Computer Simul. 2020;37(10):403–6.

[11] Ping W. Geological database for plate tectonic reconstruction: a conceptual model. Acta Geol Sin (Engl Ed). 2019;93(13):66–9. doi:10.1111/1755-6724.14248

[12] Guo JT, Liu YH, Han YF, Wang XL. Implicit 3D geological modeling method for borehole data based on machine learning. Dongbei Daxue Xuebao/J Northeast Univ. 2019;40(9):1337–42.

[13] Mamoru T. Construction of 3D geological model using geotechnical information database. Geoinformatics. 2019;30(21):17–25. doi:10.6010/geoinformatics.30.1_15

[14] Nengah S, Wayan A. The potential of liquefaction disasters based on the geological, CPT, and borehole data at Southern Bali Island. Istrazivanja i Projektovanja za Privredu. 2019;17(4):535–40. doi:10.5937/jaes17-20794

[15] Gan C, Cao W, Wu M, Liu KZ, Chen X, Hu Y, et al. Two-level intelligent modeling method for the rate of penetration in complex geological drilling process. Appl Soft Comput. 2019;12(21):68–79. doi:10.1016/j.asoc.2019.04.020

[16] Zhang C, Gholipour G, Mousavi AA. State-of-the-art review on responses of RC structures subjected to lateral impact loads. Arch Comput Methods Eng. 2021;28(4):2477–507. doi:10.1007/s11831-020-09467-5

[17] Dong J, Deng R, Quanying Z, Cai J, Ding Y, Li M. Research on recognition of gas saturation in sandstone reservoir based on capture mode. Appl Radiat Isotopes. 2021;178:109939. doi:10.1016/j.apradiso.2021.109939

[18] Geng Z, Wang Y. Physics-guided deep learning for predicting geological drilling risk of wellbore instability using seismic attributes data. Eng Geol. 2020;279(2):105–16. doi:10.1016/j.enggeo.2020.105857

[19] Li YP, Cao WH, Hu WK, Wu M. Diagnosis of downhole incidents for geological drilling processes using multi-time scale feature extraction and probabilistic neural networks. Process Saf Environ Prot. 2020;137(23):106–15. doi:10.1016/j.psep.2020.02.014

[20] Xie W, Li X, Jian W, Yang Y, Liu H, Robledo LF, et al. A novel hybrid method for landslide susceptibility mapping-based geodetector and machine learning cluster: A case of Xiaojin County, China. ISPRS Int J Geo-Inf. 2021;10(2):93. doi:10.3390/ijgi10020093

[21] Zhang S, Zhang J, Ma Y, Pak RY. Vertical dynamic interactions of poroelastic soils and embedded piles considering the effects of pile-soil radial deformations. Soils Found. 2021;61(1):16–34. doi:10.1016/j.sandf.2020.10.003

[22] Liu H, Shi Z, Li J, Liu C, Meng X, Du Y, et al. Detection of road cavities in urban cities by 3D ground-penetrating radar. Geophysics. 2021;86(3):A25–33. doi:10.1190/geo2020-0384.1

[23] Liu T, Liu HB, Meng YF, Han X, Cui S, Yu A. Multi-coupling stress field and evaluation of borehole stability in deep brittle shale. Arab J Geosci. 2020;13:115621. doi:10.1007/s12517-020-06152-6

Received: 2022-10-23
Revised: 2023-08-27
Accepted: 2023-09-21
Published Online: 2023-12-08

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 8.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/geo-2022-0554/html