Article, Open Access

A new method for graph stream summarization based on both the structure and concepts

  • Nosratali Ashrafi-Payaman, Mohammad Reza Kangavari and Amir Mohammad Fander
Published/Copyright: November 2, 2019

Abstract

Graph datasets are common in many application domains, and the graphs involved are usually massive. One way to process such massive graphs is summarization. There are two kinds of graphs: stationary graphs and graph streams. For stationary graphs, a number of summarization algorithms are available, whereas for graph streams there is no comprehensive method that summarizes a stream based on the structure, the vertex attributes, or both with varying contributions. This is due to the two main challenges of graph streams: the volume of the data and its change over time. In this paper, we propose a method based on the sliding-window model that summarizes a graph stream based on a combination of the structure and vertex attributes. We propose a new structure for summary graphs and new methods for comparing two summary graphs. To the best of our knowledge, this is the first method that summarizes a graph stream based on both the structure and vertex attributes with varying contributions. Through extensive experiments on a real dataset of Amazon co-purchased products, we demonstrate the performance of the proposed method.

1 Introduction

Graph summarization is a useful and interesting topic that has recently been studied widely in the literature [1]. The general goal of summarization is to reduce a massive graph to a smaller one by removing unimportant details while preserving the general properties of the graph. In a number of applications, data and their relations are modeled by a structural graph, e.g. cities and the roads between them. These graphs are summarized based on nodes and their relations [2, 3, 4]. On the other hand, a number of applications generate attributed graphs, in which attributes are associated with vertices or even edges [5, 6], e.g. social networks such as Facebook. In Facebook, each node represents a person and has attributes such as name, family and country. In general, summarization is performed by grouping similar nodes into one group and dissimilar nodes into different groups [1]. The similarity of two vertices can be structural, attribute-based or both. For example, in Facebook both edges and vertex attributes can be taken into account when constructing a summary. Therefore, depending on the importance of the structure, the vertex attributes or both, summarization is structural [4], attribute-based [7, 8] or hybrid [9]. The similarity of the nodes has an important impact on the resulting summary and can be calculated based on vertex connectivity, vertex attributes or both. The similarity criterion of two vertices thus determines the type of the resulting summary. These days, many applications generate data that are received as a graph stream [10], such as product sales in supermarkets. In this example, the relationships between sold products are received as a stream of edges, where each edge represents a pair of products sold together.

Although a number of algorithms have been proposed for summarizing stationary attributed graphs based on both the structure and vertex attributes with varying contributions of each [6, 8, 11], to the best of our knowledge there is no method capable of summarizing a graph stream based on the structure, the vertex attributes or both. This is the main challenge of graph stream summarization. In this paper, a new method is proposed that addresses this challenge. The proposed method summarizes a graph stream using the sliding-window paradigm, so that a summary of the graph stream is always available. We provide experimental results on the Amazon product co-purchasing network dataset to evaluate the proposed method.

We propose a method for graph stream summarization based on the sliding-window model. In this method, a graph is summarized based on both the structure and vertex attributes. For comparing two summaries, we introduce a new schema for a summary graph and a new algorithm for calculating the difference between two summaries. Overall, our contributions are as follows:

  • A new method for graph stream summarization

  • A new schema for a summary graph

  • A new algorithm for comparing two summaries and calculating their distance.

The rest of this paper is organized as follows. In Section 2, related works are reviewed. Section 3 is dedicated to the proposed method. Our experiments are explained in Section 4. Discussions are presented in Section 5, and Section 6 provides conclusions and a discussion of future work.

2 Related Works

In this section, we review previous works on four different types of graph summarization to discuss the main challenges of graph summarization.

2.1 Structural summarization

In [12], a method was proposed for structural graph summarization. In this method, a graph is compressed by partitioning similar nodes into one group and dissimilar nodes into different groups. In a compressed graph, a super-edge aggregates the edges between a pair of super-nodes. The graph is compressed based on the Minimum Description Length (MDL) principle. The authors first developed a greedy algorithm and then, to reduce its runtime, proposed a randomized version.

Riondato et al. [3] proposed another method to summarize structural graphs. In this method, the aim is to guarantee the quality of a summary and to minimize the reconstruction error. Riondato et al. presented a connection between graph summarization and geometric clustering problems for the first time. Based on this connection, the authors developed a polynomial-time algorithm to generate the best possible summary of the expected size.

Liu et al. [13] proposed three distributed summarization algorithms named DistGreedy, DistRandom and DistLSH to summarize large-scale graphs. These algorithms differ in how they select a pair of nodes to merge: greedily, randomly, and using locality-sensitive hashing, respectively.

Chen et al. [14] proposed a method based on producing randomized summary graphs for identifying frequent patterns. Structural summaries can be beneficial for frequent pattern mining. In fact, instead of mining massive and time-consuming original graphs, summary graphs are mined.

Spectral graph clustering can also be used for structural summarization. Spectral graph clustering partitions a graph based on the eigenvalues and eigenvectors of the graph adjacency matrix [15, 16, 17, 18, 19]. This technique can be beneficial in image segmentation and social network analysis. There are a number of applications that use spectral graph clustering for finding communities in networks [20]. In these applications, a large graph is first converted into a small one by summarization, and spectral graph clustering is then used to cluster the resulting small summary graph [21].

Community detection algorithms have many applications, and many articles [22, 23, 24, 25, 26] have recently been published on this subject. Graph summarization can be beneficial for detecting communities in a network.

Of course, there are some similar methods and models [27, 28], namely subgraph mining models, which are limited in comparison with summarization methods. Rather than summarizing, these models choose one or more subsets of a graph.

2.2 Attribute-based summarization

In [7], a summarization method with two novel operations, Summarization by grouping Nodes on Attributes and Pairwise relations (SNAP) and k-SNAP, was proposed. These operations are used for grouping nodes and summarizing attributed graphs. Tian et al. defined attribute- and relation-compatible grouping. They also improved the SNAP operation by proposing the k-SNAP operation, in which k is the summary size and is determined by the user. The k-SNAP operation was improved by Zhang et al. [8], who proposed the CANAL algorithm in 2009. The CANAL algorithm is used to categorize attribute values automatically and also to provide a criterion for measuring the quality of a summary.

In 2008, the OLAP framework was proposed by Chen et al. [29]. In the OLAP framework, cubes are created on the graph based on dimensions and measures. In this framework, a graph is summarized based on both selected attributes and input information.

Figure 1 Original graph (left) and its summary (right)

2.3 Hybrid summarization

In [6], a method was proposed for clustering a graph based on both the structure and vertex attributes. In this method, a new graph with real and virtual links, called the augmented graph, is constructed for a given graph. The virtual links are added to the new graph to reflect the attribute-based similarity of vertices. In the augmented graph, both real and virtual links are considered when measuring the similarity of two nodes. If the number of associated attributes is relatively high, the augmented graph becomes massive and the runtime of the algorithm is high.

In [11], another method was proposed for hybrid summarization of a graph. In this method, a graph is first summarized based on vertex attributes, without taking the graph structure into consideration, and the summary is then adjusted to the graph structure by moving nodes between super-nodes. This method may be inefficient in situations where the structure has an important impact on the construction of the summary.

In [30], a method was proposed for attributed graphs that constructs a hybrid summary by using the MDL principle to model the graph summarization problem as a code cost function and a greedy method to compute an optimal summary. In this method, the user’s needs and the ontology of the graph are not considered.

2.4 Graph stream summarization

There has been some research on graph streams, but the contribution of these works to graph stream summarization itself is not significant. The major works on graph streams are as follows.

In [31], a novel Graph Stream Sketch (GSS) was proposed to summarize graph streams with linear space cost (O(|E|)) and constant update time complexity (O(1)). The aim of Gou et al. was to construct a summary for query answering with controllable errors.

In [32], the focus is on calculating the rank of a node in a graph stream with the minimum number of passes over the stream and the minimum space, up to an adaptive error. Algorithms and models are presented for this purpose.

In [33], Feigenbaum et al. were interested in the trade-offs between model parameters such as per-data-item processing time, required space, and the required number of passes over the stream. These trade-offs were considered for solving problems such as spanner construction, BFS-tree construction and graph distance lower bounds.

In [34], Cormode et al. proposed a new variation of the streaming model in which a helper provides annotations for the data stream. They argue that, given linear-sized annotations, the memory required for many problems is reduced to a constant.

In [35], Feigenbaum et al. proposed a new streaming model and formalized it. They argued that this model is necessary for designing efficient algorithms to solve problems on massive graphs. They considered an upper bound on the space required by such algorithms and applied the proposed model to specific problems.

In [36], Aggarwal et al. also proposed a method for graph stream clustering by introducing micro-clusters and compressing them with hash functions. The proposed method can be beneficial for special applications. Aggarwal also proposed a new method for classifying a massive-domain graph stream [37], using a probabilistic approach to construct a summary that can be stored in main memory. This method was used for determining special patterns in a graph stream.

There are other works on graph streams that address problems such as connectivity [35], counting subgraphs, e.g. triangles [38, 39], calculating the degree of nodes [40], spanners [41] and sparsification [42]. Thus, to the best of our knowledge, there is no method capable of summarizing a graph stream, that is, of converting the graph to a smaller one by removing unnecessary details while preserving its overall properties.

3 The proposed method

In the proposed method, we use the sliding-window model for summarizing a graph stream. We take the edges of the first window over the graph stream and construct the graph of this window. This graph is summarized using the hybrid summarization method proposed by the authors of this paper [9, 43], which summarizes an attributed graph based on both the structure and vertex attributes. The resulting summary graph is maintained as the reference summary. We then take the edges of the second window over the graph stream, construct its graph and summarize it. This summary is called the current summary. The current summary is compared with the reference summary, and depending on whether they match, one of them is discarded. If they match, that is, if their distance does not exceed a given threshold, the current summary is discarded. Otherwise, the distance between the two summaries is higher than the threshold and the current summary replaces the reference summary, because in this case the current summary is the best representative of the graph stream.

By continuing in this way, a summary of the graph stream is available at any moment. The proposed method for graph stream summarization is depicted in Figure 2.

Figure 2 The proposed method for graph stream summarization

3.1 Algorithm

The proposed method is summarized in Algorithm 1. In this algorithm, the steps of summarizing a graph, comparing two summary graphs and calculating the distance of two summary graphs are not spelled out and need further explanation. In the following subsections, we describe the structure of a summary graph, the similarity of two super-nodes and the comparison of two summary graphs, and finally illustrate the proposed method with an example.

Algorithm 1: Graph stream summarization

Input: A graph stream;

Output: A reference summary graph;

begin
    consider the first window w0 on the graph stream;
    construct the graph of w0 and summarize it as SGR;
    while (true)
        consider the next window wc on the graph stream;
        construct the graph of wc and summarize it as SGc;
        compare the two summary graphs and let d = dist(SGR, SGc);
        if (d > threshold) replace SGR with SGc;
        else ignore SGc;
    endwhile;
end.
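
The loop of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' Java implementation: the names `summarize_stream`, `summarize` and `dist` are hypothetical, the summarizer and the distance function are passed in as parameters because they are defined elsewhere in the paper, and the stream is assumed to arrive already cut into windows. The toy summarizer at the end (number of distinct vertices) only serves to show the control flow.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

Edge = Tuple[int, int]
Summary = TypeVar("Summary")


def summarize_stream(
    windows: Iterable[List[Edge]],
    summarize: Callable[[List[Edge]], Summary],
    dist: Callable[[Summary, Summary], float],
    threshold: float,
) -> Iterator[Summary]:
    """Yield the reference summary that is valid after each window."""
    window_iter = iter(windows)
    try:
        first = next(window_iter)
    except StopIteration:
        return  # empty stream: nothing to summarize
    reference = summarize(first)  # SGR in Algorithm 1
    yield reference
    for window in window_iter:
        current = summarize(window)  # SGc in Algorithm 1
        if dist(reference, current) > threshold:
            reference = current  # the current summary becomes the reference
        # otherwise the current summary is ignored
        yield reference


# Toy stand-ins (hypothetical): a "summary" is just the number of vertices.
toy_windows = [
    [(1, 2), (2, 3)],
    [(1, 2), (3, 4)],
    [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
]
toy_summarize = lambda edges: len({v for edge in edges for v in edge})
toy_dist = lambda a, b: abs(a - b)
references = list(summarize_stream(toy_windows, toy_summarize, toy_dist, threshold=1))
```

Writing the loop as a generator reflects the statement that a summary of the stream is available at any moment: the caller can read the current reference after every window without waiting for the stream to end.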

3.2 The summary structure

In the proposed method, an attributed graph is summarized based on both the structure and vertex attributes. Every super-node in the summary graph is a vector of structural and semantic attributes. The structural attributes are the number of vertices in the super-node, the degree of the super-node and the percentage of its vertices that are connected to vertices of each of the other super-nodes. The semantic attributes are the percentages of vertices that take a given value of an attribute; this percentage is calculated for every value of every vertex attribute. In Section 3.5 we illustrate the summary structure with an example.
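
One possible in-memory representation of such a super-node vector is sketched below in Python. The class name `SuperNode`, its field names and the helper `attribute_shares` are hypothetical, and the sketch assumes that the degree of a super-node is the number of super-nodes it is connected to and that percentages are stored as fractions in [0, 1].

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SuperNode:
    size: int                                # number of vertices in the super-node
    degree: int                              # assumed: number of neighbouring super-nodes
    edge_share: List[float]                  # share of members linked to each neighbouring super-node
    attr_share: Dict[str, Dict[str, float]]  # attribute -> value -> share of members


def attribute_shares(members: List[Dict[str, str]]) -> Dict[str, Dict[str, float]]:
    """Compute, for every attribute value, the fraction of members holding it."""
    counts: Dict[str, Counter] = {}
    for member in members:
        for attr, value in member.items():
            counts.setdefault(attr, Counter())[value] += 1
    total = len(members)
    return {
        attr: {value: count / total for value, count in counter.items()}
        for attr, counter in counts.items()
    }


# Hypothetical four-member group, for illustration only.
members = [
    {"gender": "Female", "education": "BSc"},
    {"gender": "Female", "education": "MSc"},
    {"gender": "Male", "education": "BSc"},
    {"gender": "Female", "education": "PhD"},
]
node = SuperNode(size=len(members), degree=2, edge_share=[0.5, 0.25],
                 attr_share=attribute_shares(members))
```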

3.3 Similarity of two super-nodes

Based on the proposed structure for the summary graph, a super-node is a vector of attributes, and the similarity of two super-nodes is calculated based on their vectors. The similarity of two super-nodes is calculated using Equation (1), which also uses Equations (2) through (6). First, the distance of the two super-nodes is calculated, and then the similarity is obtained by subtracting this value from one.

$$\mathrm{sim}(V_p, V_q) = 1 - \mathrm{dis}(V_p, V_q) \tag{1}$$

The distance between two super-nodes is equal to the average of their structural and attribute-based distances, as given by Equation (2).

$$\mathrm{dis}(V_p, V_q) = \frac{1}{2}\left(\mathrm{dis}_{st}(V_p, V_q) + \mathrm{dis}_{att}(V_p, V_q)\right) \tag{2}$$

The number of vertices, the degree of the super-node and the share of vertices that are connected to vertices of other super-nodes (the weight of edges) are considered as structural attributes. These structural attributes determine the structural distance of two super-nodes, which is described by Equation (3). As seen in Equation (3), the value of the structural distance lies in [0,1]. For each of the three parts of Equation (3), if the denominator is zero, the value of that part is taken to be zero.

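$$\mathrm{dis}_{st}(V_p, V_q) = \frac{1}{3}\left(\frac{(n_p - n_q)^2}{n_{max}^2} + \frac{(d_p - d_q)^2}{d_{max}^2} + \frac{1}{\max(d_p, d_q)}\sum_{i=1}^{\max(d_p, d_q)} \left(pe_{p,i} - pe_{q,i}\right)^2\right) \tag{3}$$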

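The attribute-based distance between two super-nodes with $k$ attributes, where the $i$-th attribute has $k_i$ values, is calculated using Equation (4).

$$\mathrm{dis}_{att}(V_p, V_q) = \frac{1}{k}\sum_{i=1}^{k} \mathrm{dis}_{att}(V_{p,i}, V_{q,i}) \tag{4}$$

where

$$\mathrm{dis}_{att}(V_{p,i}, V_{q,i}) = \frac{1}{k_i}\sum_{j=1}^{k_i} \mathrm{dis}_{att}(V_{p,i,j}, V_{q,i,j})^2 \tag{5}$$

and

$$\mathrm{dis}_{att}(V_{p,i,j}, V_{q,i,j}) = per_{p,i,j} - per_{q,i,j} \tag{6}$$

Here $n_p$ and $d_p$ are the number of vertices in $V_p$ and the degree of $V_p$, respectively, $pe_{p,i}$ is the share of vertices of $V_p$ connected to the $i$-th neighbouring super-node, and $per_{p,i,j}$ is the share of vertices of $V_p$ that take the $j$-th value of the $i$-th attribute. In the worked example of Section 3.5, $n_{max}$ and $d_{max}$ are the larger of the two values being compared.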
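
Equations (1) through (6) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, percentages are given as fractions, n_max and d_max are taken as the larger of the two compared values (as in the example of Section 3.5), and each edge-share list is assumed to have as many entries as the degree of its super-node. The sample values at the end are those of super-node V1 in the two summaries of Section 3.5.

```python
from typing import Sequence


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or zero when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def structural_distance(n_p: int, n_q: int, d_p: int, d_q: int,
                        pe_p: Sequence[float], pe_q: Sequence[float]) -> float:
    """Equation (3): mean of the size, degree and edge-share terms."""
    n_max, d_max = max(n_p, n_q), max(d_p, d_q)
    size_term = _ratio((n_p - n_q) ** 2, n_max ** 2)
    degree_term = _ratio((d_p - d_q) ** 2, d_max ** 2)
    # Assumption: a missing edge share counts as zero.
    padded_p = list(pe_p) + [0.0] * (d_max - len(pe_p))
    padded_q = list(pe_q) + [0.0] * (d_max - len(pe_q))
    squared = sum((a - b) ** 2 for a, b in zip(padded_p, padded_q))
    edge_term = _ratio(squared, d_max)
    return (size_term + degree_term + edge_term) / 3


def attribute_distance(per_p: Sequence[Sequence[float]],
                       per_q: Sequence[Sequence[float]]) -> float:
    """Equations (4)-(6): mean over attributes of the mean squared share difference."""
    per_attribute = [
        _ratio(sum((a - b) ** 2 for a, b in zip(values_p, values_q)), len(values_p))
        for values_p, values_q in zip(per_p, per_q)
    ]
    return _ratio(sum(per_attribute), len(per_attribute))


def similarity(dis_st: float, dis_att: float) -> float:
    """Equations (1) and (2)."""
    return 1.0 - 0.5 * (dis_st + dis_att)


# Values of super-node V1 in the two summaries of Section 3.5.
dis_st = structural_distance(600, 650, 2, 2, [0.9, 0.2], [0.8, 0.1])
dis_att = attribute_distance([[0.65, 0.35], [0.3, 0.3, 0.4]],
                             [[0.55, 0.45], [0.2, 0.35, 0.45]])
sim = similarity(dis_st, dis_att)
```

With these inputs, `dis_st`, `dis_att` and `sim` round to 0.0053, 0.0075 and 0.9936, which agrees with the step-by-step calculation in Section 3.5.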

3.4 Comparing two summary graphs

To compute the distance of two summary graphs, the similarity of each pair of super-nodes of the two summary graphs is first calculated using Equation (1). The super-node pairs with the highest similarity are associated with each other. After associating the super-nodes of the two summary graphs, the distance of the two summary graphs is calculated. This distance is equal to the sum of the distances of the matched super-node pairs. The approach for calculating the distance of two summary graphs is described in Algorithm 2.

Algorithm 2: Calculating the distance of two summary graphs

Input: summary graphs: GS1 and GS2 and the size of summary graph: size;

Output: distance of two summaries;

begin
    calculate the distance of every pair of super-nodes, one from GS1 and one from GS2;
    add every super-node pair with its calculated distance to an ascending priority queue q;
    n = size;
    while (n > 0)
        remove a super-node pair from q;
        match the two super-nodes of the pair;
        n = n - 1;
    endwhile;
    set dist to the sum of the distances of the matched super-node pairs;
end.
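
A minimal Python sketch of Algorithm 2, using the standard `heapq` module as the ascending priority queue, is given below. The function name `match_and_distance` is hypothetical. The sketch also makes one assumption that the pseudocode leaves implicit: a pair taken from the queue is skipped when either of its super-nodes has already been matched, so that every super-node is matched at most once. The usage example converts the similarities of Table 1 into distances with Equation (1).

```python
import heapq
from typing import List, Tuple


def match_and_distance(dist: List[List[float]]) -> Tuple[List[Tuple[int, int]], float]:
    """Greedily match super-nodes by ascending distance and sum the matched distances.

    dist[i][j] is the distance between super-node i of the first summary
    and super-node j of the second summary.
    """
    heap = [(d, i, j) for i, row in enumerate(dist) for j, d in enumerate(row)]
    heapq.heapify(heap)  # ascending priority queue of (distance, i, j)
    size = min(len(dist), len(dist[0])) if dist else 0
    used_first, used_second = set(), set()
    pairs: List[Tuple[int, int]] = []
    total = 0.0
    while heap and len(pairs) < size:
        d, i, j = heapq.heappop(heap)
        if i in used_first or j in used_second:
            continue  # assumption: each super-node is matched only once
        used_first.add(i)
        used_second.add(j)
        pairs.append((i, j))
        total += d
    return pairs, total


# Similarities from Table 1 (rows and columns are 0-based indices of V1..V3).
table1 = [
    [0.9899, 0.9522, 0.9762],
    [0.9713, 0.9945, 0.9755],
    [0.9602, 0.9783, 0.9920],
]
distances = [[1.0 - s for s in row] for row in table1]
pairs, total = match_and_distance(distances)
```

For the Table 1 values, the pairs are matched in the order (V2, V2), (V3, V3), (V1, V1), and the summed distance of the matched pairs is about 0.0236.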

3.5 Illustrating the proposed method with an example

To clarify the approach, we consider two summary graphs SG1 and SG2 with the above-mentioned structure, each with three super-nodes and two attributes. The attributes are gender, with the values Male and Female, and education level, with the values BSc., MSc. and Ph.D. As shown in Figure 3, the summary graph presents the overall and important information of the original graph. For example, super-node V1 represents a group of 600 people, of whom 20% are in a relationship with people of V2, 80% are in a relationship with people of V3, 65% are female and 35% are male, 30% hold a Bachelor of Science, 30% hold a Master of Science and 40% hold a Doctor of Philosophy.

Figure 3 First summary with three super-nodes and two attributes. Attribute values, size of super-nodes and the percentage of vertices, which are in relationship to vertices of the other super-nodes, are shown in the summary.

Figure 4 Second summary with three super-nodes and two attributes. Attribute values, size of super-nodes and the percentage of vertices, which are in relationship to vertices of the other super-nodes, are shown in the summary.

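As already mentioned, the similarity of every pair of super-nodes of the two summary graphs is calculated first. For clarity, we calculate below the similarity of super-node V1 of SG1 and super-node V1 of SG2 step by step; in the equations, $V'_1$ denotes super-node V1 of SG2.

$$\begin{aligned}
\mathrm{dis}_{st}(V_1, V'_1) &= \frac{1}{3}\left(\frac{(600-650)^2}{650^2} + \frac{(2-2)^2}{2^2} + \frac{1}{\max(2,2)}\sum_{i=1}^{\max(2,2)}\left(pe_{p,i} - pe_{q,i}\right)^2\right) \\
&= \frac{1}{3}\left(\frac{50^2}{650^2} + 0 + \frac{1}{2}\left((0.9-0.8)^2 + (0.2-0.1)^2\right)\right) \\
&= \frac{1}{3}\left(\frac{5 \times 5}{65 \times 65} + 0 + \frac{1}{2}\left(0.1^2 + 0.1^2\right)\right) \\
&= \frac{1}{3}\left(\frac{5 \times 5}{65 \times 65} + 0 + \frac{1}{2}\left(0.01 + 0.01\right)\right) \\
&= \frac{1}{3}\left(\frac{1}{13 \times 13} + 0 + 0.01\right) = 0.0053
\end{aligned}$$

$$\begin{aligned}
\mathrm{dis}_{att}(V_1, V'_1) &= \frac{1}{2}\left(\frac{1}{2}\left((0.65-0.55)^2 + (0.35-0.45)^2\right) + \frac{1}{3}\left((0.3-0.2)^2 + (0.3-0.35)^2 + (0.4-0.45)^2\right)\right) \\
&= \frac{1}{2}\left(\frac{1}{2}\left(0.01 + 0.01\right) + \frac{1}{3}\left(0.01 + 0.0025 + 0.0025\right)\right) \\
&= \frac{1}{2}\left(0.01 + \frac{1}{3}\left(0.01 + 0.0025 + 0.0025\right)\right) \\
&= \frac{1}{2}\left(0.01 + 0.005\right) = 0.0075
\end{aligned}$$

$$\mathrm{dis}(V_1, V'_1) = \frac{1}{2}\left(0.0053 + 0.0075\right) = 0.0064, \qquad \mathrm{sim}(V_1, V'_1) = 1 - 0.0064 = 0.9936$$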

To save space, we omit the computational steps for the other super-node pairs and only report their final similarity values in Table 1.

Table 1

The similarity of super-nodes

        V1       V2       V3
V1    0.9899   0.9522   0.9762
V2    0.9713   0.9945   0.9755
V3    0.9602   0.9783   0.9920

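Based on Table 1, the matched super-node pairs are (V2, V2), (V3, V3) and (V1, V1), in that order. The first component of each pair is a super-node of the first summary graph and the second component is its matched super-node of the second summary graph. Since the entries of Table 1 are similarities, the distance of each matched pair is one minus its similarity, by Equation (1). According to this matching, the distance of the two summary graphs SG1 and SG2 is calculated as follows, where $V'_i$ denotes super-node Vi of SG2:

$$\mathrm{dis}(SG_1, SG_2) = \mathrm{dis}(V_1, V'_1) + \mathrm{dis}(V_2, V'_2) + \mathrm{dis}(V_3, V'_3) = (1 - 0.9899) + (1 - 0.9945) + (1 - 0.9920) = 3 - 2.9764 = 0.0236$$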

4 Experiment

In this section, we describe the experiments conducted to evaluate the performance of the proposed method on real-world graphs. The proposed method was implemented in the Java programming language.

4.1 Dataset

Amazon co-purchasing network

This dataset is available at the address given in [1] and includes information about different products such as books, music CDs, DVDs and VHS video tapes. There are 548,552 products, and for each product information such as title, salesrank, list of similar products, category and reviews is available. The data describe Amazon co-purchased products of 2003 and were collected in summer 2006 by Jure Leskovec by crawling the Amazon website. The information on these products and their graph streams is presented in Table 2. The first four rows of the table describe four directed graph streams. In each graph stream, an edge (x, y) indicates that product y was frequently co-purchased with product x. We chose the Id, ASIN, group and salesrank fields for our experiments.

Table 2

The information of Amazon co-purchasing network

Name          #Nodes    #Edges      Description
amazon0302    262,111   1,234,877   Amazon product co-purchasing network from March 2 2003
amazon0312    400,727   3,200,440   Amazon product co-purchasing network from March 12 2003
amazon0505    410,236   3,356,824   Amazon product co-purchasing network from May 5 2003
amazon0601    403,394   3,387,388   Amazon product co-purchasing network from June 1 2003
amazon-meta   548,552   1,788,725   Amazon product metadata: product info and all reviews on around 548,552 products

4.2 Evaluation

To the best of our knowledge, our proposed method is a novel general-purpose method for graph stream summarization, and there is no competing method against which it can be evaluated directly. We therefore consider comparing the results of our method with the changes in the actual constructed graphs to be more reasonable and reliable than comparing with other methods.

To evaluate the proposed method for graph stream summarization, we chose the amazon0302 file and set the window size to 1000 edges. We placed the first window over the first 1000 edges of the file and constructed the first graph. For every window, the vertices are those that appear as at least one endpoint of the first 5000 edges. The graph of the first window was summarized, resulting in a summary graph of size 7, which is maintained as the reference. The subsequent windows over the graph stream are then processed in turn: each graph is summarized, each new summary is compared with the reference summary, and the reference summary is replaced if necessary. In this experiment the window size was fixed at 1000 edges, although usually the first window is chosen larger than the others. Five summary graphs are presented in Figures 5 through 9. In these figures, dis and toy denote discontinued and toy products. These two categories do not belong to the main categories mentioned in the description of the dataset.
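
Cutting an edge stream into fixed-size windows, as in this experiment, can be sketched in Python as follows. The function names are hypothetical, and two assumptions are made that the text does not state: the input is a text edge list with one whitespace-separated "source target" pair per line and comment lines starting with `#`, and consecutive windows do not overlap.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]


def parse_edges(lines: Iterable[str]) -> Iterator[Edge]:
    """Turn text lines into (source, target) edges, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        yield int(source), int(target)


def fixed_windows(edges: Iterable[Edge], size: int) -> Iterator[List[Edge]]:
    """Yield consecutive, non-overlapping windows of at most `size` edges."""
    edge_iter = iter(edges)
    while True:
        window = list(islice(edge_iter, size))
        if not window:
            return
        yield window


# Tiny hypothetical stream; the experiment above uses windows of 1000 edges.
sample_lines = ["# FromNodeId ToNodeId", "0\t1", "0\t2", "1\t3", "", "2\t4", "3\t5"]
windows = list(fixed_windows(parse_edges(sample_lines), size=2))
```

Because both functions are lazy, the whole file never has to be held in memory; only the edges of the window currently being summarized are materialized.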

Figure 5 The first summary graph of the size of 7

The semantics of each summary graph have been extracted and are shown in Table 3. The semantic changes between consecutive summary graphs are presented in Table 4. Figures 5 through 9 and Tables 3 and 4 are used to discuss the experimental results.

Table 3

Semantics of the summary graphs

Row   Summary graph   Summary graph semantics
1     Figure 5        Discontinued products are related to books. The majority of books are related to each other. The majority of the super-nodes are isolated.
2     Figure 6        Discontinued products are related to all other products. Super-nodes are related to each other (a near clique). Only the super-node of Toy is isolated.
3     Figure 7        There is no isolated super-node. All super-nodes are in a relationship with the super-node of book. There is a 4-clique in the graph.
4     Figure 8        There is no isolated super-node. All super-nodes are in a relationship with the book super-node. The number of sold music products is maximal.
5     Figure 9        There is no toy category in the summary graph. All super-nodes are in a relationship with the book super-node. The majority of books are in a relationship with each other.

Table 4

Semantic changes of the consecutive summary graphs

Row   Summary graphs        Distance     Semantic changes
1     Figure 5, Figure 6    0.32612      The graph becomes denser; discontinued products have been sold with other products
2     Figure 6, Figure 7    0.4389434    Isolated super-node; 4-clique
3     Figure 7, Figure 8    0.14072233   4-clique
4     Figure 8, Figure 9    0.36400673   Toy category; the number of edges

Figure 6 The second summary graph of the size of 7

Figure 7 The third summary graph of the size of 7

Figure 8 The fourth summary graph of the size of 7

Figure 9 The fifth summary graph of the size of 7

4.3 Time complexity

In the proposed method, the running time is dominated by the summarization algorithm. According to the summarization algorithm in [9, 43], the similarity of each pair of nodes is calculated first, and the graph is then summarized by merging nodes/super-nodes. In the worst case, the summarization algorithm performs at most |V| merge operations to obtain the expected summary. Hence, the time complexity of this method is O(|E|×|V|). The time complexity of the other steps, such as calculating the distance between two super-nodes, matching the super-nodes of two summary graphs and calculating the distance of two summary graphs, is lower than that of the summarization algorithm.

5 Discussions

The summary graphs shown in Figures 5 through 9, the semantics of each summary shown in Table 3 and the distance of each pair of consecutive summary graphs shown in Table 4 help us to justify these distances intuitively. The distances calculated with the above-mentioned formulas should be supported by intuitive structural and semantic changes.

As shown in Table 4, the first and second summary graphs have a distance of about 0.3, and intuitively these two summary graphs differ in two respects, as listed in the fourth column. Therefore, the distance of these two summary graphs is supported by the intuitive changes. The same holds for the second and third summary graphs. On the other hand, the third and fourth summary graphs have a lower distance than the previous consecutive pairs, and this is also in line with the intuitive changes: they differ intuitively in only one respect, the presence or absence of a 4-clique. The situation for the fourth and fifth summary graphs is similar to that of the first three. Therefore, the calculated distance for each pair of consecutive summary graphs agrees with the intuitive distance of the summary graphs and is reliable.

The threshold on the distance between two summary graphs determines whether the reference needs to be changed. For example, if we set the threshold to 0.5, the first summary graph remains the reference. On the other hand, if we set the threshold to 0.2, the reference is initially the first summary graph, and when the second summary graph appears, it becomes the new reference. The same happens for the third summary graph, which becomes the reference in turn. When the fourth summary graph appears, the reference summary does not change. When the fifth summary graph appears, the reference is replaced with the fifth summary graph. The threshold value can be chosen according to the scope and the required precision.
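
The effect of the threshold described above can be replayed with a few lines of Python. The function name `reference_trace` is hypothetical, and the sketch follows the simplification made in this discussion: the distances between consecutive summary graphs from Table 4 stand in for the distance between each new summary and the current reference.

```python
from typing import List


def reference_trace(distances: List[float], threshold: float) -> List[int]:
    """Return the 1-based index of the reference summary after each summary arrives."""
    reference = 1  # the first summary is the initial reference
    trace = [reference]
    for index, distance in enumerate(distances, start=2):
        if distance > threshold:
            reference = index  # the new summary replaces the reference
        trace.append(reference)
    return trace


# Distances between consecutive summary graphs, taken from Table 4.
table4_distances = [0.32612, 0.4389434, 0.14072233, 0.36400673]
trace_high = reference_trace(table4_distances, threshold=0.5)
trace_low = reference_trace(table4_distances, threshold=0.2)
```

With a threshold of 0.5 the trace is `[1, 1, 1, 1, 1]`, so the first summary stays the reference; with a threshold of 0.2 it is `[1, 2, 3, 3, 5]`, matching the sequence of replacements described above.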

Our proposed method is general-purpose because it takes into account the structure, vertex attributes, user’s needs and graph ontology in summarization. Hence, the summary graph can be beneficial in community detection, node degree calculations and similar tasks. By setting the parameters of the summarization algorithm at the start, it is possible to change the orientation of the summarization.

6 Conclusions

In this paper, we proposed a method for graph stream summarization based on the sliding-window model. In the proposed method, a graph is summarized based on both the structure and vertex attributes. The super-nodes of two summary graphs are matched to each other and the distance of every pair of matched super-nodes is calculated. The distance of two summary graphs is calculated from the distances of the matched super-nodes. If the distance between the new summary graph and the reference is higher than the threshold, the reference is replaced with the new summary graph.

To the best of our knowledge, this is the first method for summarizing a graph stream based on both the structure and vertex attributes. With this method, a summary of the graph stream is always available. The summary graph is a representative of the graph stream that has the overall properties of the stream and can be used for decision-making. In this paper, a number of algorithms have been proposed for calculating the similarity of two super-nodes, matching super-nodes and calculating the distance of two summary graphs.

In the proposed method, the window size was fixed, although windows of varying length are more natural. We plan to extend the proposed method by considering windows of varying length and learning the window size algorithmically.

In real-world applications, some data are missing, and this issue should be considered when designing experiments. In addition, a number of applications generate more than one graph stream, and these graph streams should be summarized simultaneously. A future research avenue would be summarizing multiple graph streams.

References

[1] Liu Y., Safavi T., Dighe A., Koutra D., Graph Summarization Methods and Applications: A Survey, ACM Comput. Surv. 2018, 51.3, p. 62. doi:10.1145/3186727

[2] Navlakha S., Rastogi R., Shrivastava N., Graph summarization with bounded error, in Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD ’08 2008, p. 419. doi:10.1145/1376616.1376661

[3] Riondato M., García-Soriano D., Bonchi F., Graph summarization with quality guarantees, Data Min. Knowl. Discov. 2017, 31.2, 314–349. doi:10.1109/ICDM.2014.56

[4] LeFevre K., Terzi E., GraSS: Graph Structure Summarization, in Proceedings of 2010 SIAM International Conference on Data Mining 2013, 454–465. doi:10.1137/1.9781611972801.40

[5] Basu-Roy S., Eliassi-Rad T., Papadimitriou S., Fast and Effective Pattern Matching on Weighted Attributed Graphs, ACM Knowl. Discov. Data Min. 2013.

[6] Cheng H., Zhou Y., Yu J. X., Clustering Large Attributed Graphs: A Balance between Structural and Attribute Similarities, ACM Trans. Knowl. Discov. Data 2011, 5.2, 12:1-12:33. doi:10.1145/1921632.1921638

[7] Tian Y., Hankins R. A., Patel J. M., Efficient aggregation for graph summarization, in Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD ’08 2008, 567–80. doi:10.1145/1376616.1376675

[8] Zhang N., Tian Y., Patel J. M., Discovery-driven graph summarization, in 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010) 2010, 880–891. doi:10.1109/ICDE.2010.5447830

[9] Ashrafi-Payaman N., Kangavari M. R., GSSC: Graph summarization based on both structure and concepts, Int. J. Inf. Commun. Technol. Res. 2017, 9.1, 33–44. doi:10.1515/eng-2019-0060

[10] McGregor A., Graph Stream Algorithms: A Survey, ACM SIGMOD Rec. 2014, 43.1, 9–20. doi:10.1145/2627692.2627694

[11] Bei Y., Lin Z., Chen D., Summarizing scale-free networks based on virtual and real links, Phys. A Stat. Mech. its Appl. 2016, 444, 360–372. doi:10.1016/j.physa.2015.08.048

[12] Navlakha S., Rastogi R., Shrivastava N., Graph summarization with bounded error, in Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD ’08 2008, 419–432. doi:10.1145/1376616.1376661

[13] Liu X., Tian Y., He Q., Lee W. C., McPherson J., Distributed Graph Summarization, in Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management - CIKM ’14 2014, 799–808. doi:10.1145/2661829.2661862

[14] Chen C., Lin C., Mining graph patterns efficiently via randomized summaries, in Proceedings of the VLDB Endowment 2009, 2.1, 742–753. doi:10.14778/1687627.1687711

[15] Von Luxburg U., A tutorial on spectral clustering, Stat. Comput. 2007, 17.4, 395–416. doi:10.1007/s11222-007-9033-z

[16] Dhillon I. S., Guan Y., Kulis B., A Unified View of Kernel k-means, Spectral Clustering and Graph Cuts 2004. doi:10.1145/1014052.1014118

[17] Auffarth B., Spectral Graph Clustering, Univ. Barcelona course Rep. Tech. Av. Aprendizaj Univ. Politec. Catalunya 2007.

[18] Uw S., Ng A. Y., Jordan M. I., Weiss Y., On spectral clustering: Analysis and an algorithm, Adv. Neural Inf. Process. Syst. 2002, 14, 849–856.

[19] Zhou D., Burges C. J. C., Spectral clustering and transductive learning with multiple views, in Proceedings of the 24th international conference on Machine learning - ICML ’07 2007, 1159–1166. doi:10.1145/1273496.1273642

[20] Smyth S., White S., A spectral clustering approach to finding communities in graphs, in Proceedings of the 5th SIAM International Conference on Data Mining 2005, 274–285. doi:10.1137/1.9781611972757.25

[21] Liu J., Wang C., Danilevsky M., Han J., Large-scale spectral clustering on graphs, in IJCAI International Joint Conference on Artificial Intelligence 2013.

[22] Wang C.-D., Lai J. H., Yu P. S., Dynamic Community Detection in Weighted Graph Streams, in Proceedings of the 2013 SIAM International Conference on Data Mining 2013, 151–161. doi:10.1137/1.9781611972832.17

[23] Arts G., Member S., Overlapping Community Detection Algorithms in Dynamic Networks: An Overview, Int. J. Emerg. Technol. Comput. Appl. Sci. 2013.

[24] Lancichinetti A., Fortunato S., Community detection algorithms: A comparative analysis, Phys. Rev. E - Stat. Nonlinear, Soft Matter Phys. 2009, 80.5, 1–11. doi:10.1103/PhysRevE.80.056117

[25] Wang W., Street W. N., A novel algorithm for community detection and influence ranking in social networks, ASONAM 2014 - Proc. 2014 IEEE/ACM Int. Conf. Adv. Soc. Networks Anal. Min. 2014, 555–560. doi:10.1109/ASONAM.2014.6921641

[26] Benyahia O., Largeron C., Jeudy B., Community detection in dynamic graphs with missing edges, in Proceedings of International Conference on Research Challenges in Information Science 2017, 372–381. doi:10.1109/RCIS.2017.7956562

[27] Hosseini S., Yin H., Zhang M., Elovici Y., Zhou X., Mining subgraphs from propagation networks through temporal dynamic analysis, in Proceedings of IEEE International Conference on Mobile Data Management 2018, 66–75. doi:10.1109/MDM.2018.00023

[28] Hosseini S., Yin H., Cheung N. M., Leng K. P., Elovici Y., Zhou X., Exploiting reshaping subgraphs from bilateral propagation graphs, in International Conference on Database Systems for Advanced Applications 2018, 342–351. doi:10.1007/978-3-319-91452-7_23

[29] Chen C., Yan X., Zhu F., Han J., Yu P. S., Graph OLAP: Towards online analytical processing on graphs, in Proceedings - IEEE International Conference on Data Mining, ICDM 2008, 103–112. doi:10.1109/ICDM.2008.30

[30] Wu Y., Zhong Z., Xiong W., Jing N., Graph summarization for attributed graphs, in Proceedings - 2014 International Conference on Information Science, Electronics and Electrical Engineering, ISEEE 2014 2014, 503–507. doi:10.1109/InfoSEEE.2014.6948163

[31] Gou X., Zou L., Zhao C., Yang T., Fast and Accurate Graph Stream Summarization, in IEEE 35th International Conference on Data Engineering (ICDE). IEEE 2019, 1118–1129. doi:10.1109/ICDE.2019.00103

[32] Das Sarma A., Gollapudi S., Panigrahy R., Estimating PageRank on graph streams, J. ACM 2011, 58.3, p. 13. doi:10.1145/1376916.1376928

[33] Feigenbaum J., Kannan S., McGregor A., Suri S., Zhang J., Graph Distances in the Data-Stream Model, SIAM J. Comput. vol. 38, no. 5, pp. 1709–1727. doi:10.1137/070683155

[34] Cormode G., Mitzenmacher M., Thaler J., Streaming graph computations with a helpful advisor, Algorithmica 2013, 65.2, 409–442. doi:10.1007/978-3-642-15775-2_20

[35] Feigenbaum J., Kannan S., McGregor A., Suri S., Zhang J., On graph problems in a semi-streaming model, 2005, 348.2, 207–216. doi:10.1016/j.tcs.2005.09.013

[36] Aggarwal C. C., Zhao Y., Yu P. S., On Clustering Graph Streams, in Proceedings of the 2010 SIAM International Conference on Data Mining 2010, 478–489. doi:10.1137/1.9781611972801.42

[37] Aggarwal C. C., On Classification of Graph Streams, in Proceedings of the 2011 SIAM International Conference on Data Mining 2013, 652–663. doi:10.1137/1.9781611972818.56

[38] Tsourakakis C. E., Kang U., Miller G. L., Faloutsos C., DOULION: Counting Triangles in Massive Graphs with a Coin, in KDD ’09: 15th International Conference on Knowledge Discovery and Data Mining 2009, 837–846. doi:10.1145/1557019.1557111

[39] Braverman V., Ostrovsky R., Vilenchik D., How hard is counting triangles in the streaming model?, in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2013, vol. 7965 LNCS, no. PART 1, 244–254. doi:10.1007/978-3-642-39206-1_21

[40] Cormode G., Muthukrishnan S., An Improved Data-Stream Summary. The Count-min Sketch and its Applications, J. Algorithms 2005, 55.1, 58–75. doi:10.1007/978-3-540-24698-5_7

[41] Ahn K. J., Guha S., McGregor A., Graph sketches: sparsification, spanners, and subgraphs, in Proceedings of the 31st symposium on Principles of Database Systems - PODS ’12 2012, 5–14. doi:10.1145/2213556.2213560

[42] Ahn K. J., Guha S., Graph sparsification in the semi-streaming model, in International Colloquium on Automata, Languages, and Programming 2009, 328–338. doi:10.1007/978-3-642-02930-1_27

[43] Ashrafi-Payaman N., Kangavari M. R., Graph Hybrid Summarization, 2018, 6.2, 335–340.

Received: 2018-12-13
Accepted: 2019-11-11
Published Online: 2019-11-02

© 2019 N. Ashrafi-Payaman et al., published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 9.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/eng-2019-0060/html