research-article

GraphMineSuite: enabling high-performance and programmable graph mining algorithms with set algebra

Authors Info & Claims
Published:01 July 2021Publication History
Skip Abstract Section

Abstract

We propose GraphMineSuite (GMS): the first benchmarking suite for graph mining that facilitates evaluating and constructing high-performance graph mining algorithms. First, GMS comes with a benchmark specification based on extensive literature review, prescribing representative problems, algorithms, and datasets. Second, GMS offers a carefully designed software platform for seamless testing of different fine-grained elements of graph mining algorithms, such as graph representations or algorithm subroutines. The platform includes parallel implementations of more than 40 considered baselines, and it facilitates developing complex and fast mining algorithms. High modularity is possible by harnessing set algebra operations such as set intersection and difference, which enables breaking complex graph mining algorithms into simple building blocks that can be separately experimented with. GMS is supported with a broad concurrency analysis for portability in performance insights, and a novel performance metric to assess the throughput of graph mining algorithms, enabling more insightful evaluation. As use cases, we harness GMS to rapidly redesign and accelerate state-of-the-art baselines of core graph mining problems: degeneracy reordering (by >2X), maximal clique listing (by >9×), k-clique listing (by up to 1.1×), and subgraph isomorphism (by 2.5×), also obtaining better theoretical performance bounds.

References

  1. C. R. Aberger, A. Lamb, S. Tu, A. Nötzli, K. Olukotun, and C. Ré. Emptyheaded: A relational engine for graph processing. ACM Transactions on Database Systems (TODS), 42(4):1--44, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. M. Adedoyin-Olowe, M. M. Gaber, and F. Stahl. A survey of data mining techniques for social media analysis. arXiv preprint arXiv:1312.4617, 2013.Google ScholarGoogle Scholar
  3. C. C. Aggarwal and H. Wang. Graph data management and mining: A survey of algorithms and applications. In Managing and mining graph data, pages 13--68. Springer, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  4. C. C. Aggarwal and H. Wang. A survey of clustering algorithms for graph data. In Managing and mining graph data, pages 275--301. Springer, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C. C. Aggarwal, H. Wang, et al. Managing and mining graph data, volume 40. Springer, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. Ahmad, F. Hijaz, Q. Shi, and O. Khan. Crono: A benchmark suite for multithreaded graph algorithms executing on futuristic multicores. In 2015 IEEE International Symposium on Workload Characterization, pages 44--55. IEEE, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. M. Al Hasan et al. Link prediction using supervised learning. In SDM, 2006.Google ScholarGoogle Scholar
  8. M. Al Hasan and M. J. Zaki. A survey of link prediction in social networks. In Social network data analytics, pages 243--275. Springer, 2011.Google ScholarGoogle ScholarCross RefCross Ref
  9. K. Ammar and M. T. Özsu. Wgb: Towards a universal graph benchmark. In Advancing Big Data Benchmarks, pages 58--72. Springer, 2013.Google ScholarGoogle Scholar
  10. T. G. Armstrong, V. Ponnekanti, D. Borthakur, and M. Callaghan. Linkbench: a database benchmark based on the facebook social graph. In ACM SIGMOD, pages 1185--1196, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. A. Bader and K. Madduri. Design and implementation of the hpcs graph analysis benchmark on symmetric multiprocessors. In International Conference on High-Performance Computing, pages 465--476. Springer, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. Bassini, M. Danelutto, and P. Dazzi. Parallel Computing is Everywhere, volume 32. IOS Press, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. S. Beamer, K. Asanović, and D. Patterson. The gap benchmark suite. arXiv preprint arXiv:1508.03619, 2015.Google ScholarGoogle Scholar
  14. P. Berkhin. A survey of clustering data mining techniques. In Grouping multidimensional data, pages 25--71. Springer, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  15. M. Besta, A. Carigiet, Z. Vonarburg-Shmaria, K. Janda, L. Gianinazzi, and T. Hoefler. High-performance parallel graph coloring with strong guarantees on work, depth, and quality. In ACM/IEEE Supercomputing, 2020. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. Besta and T. Hoefler. Survey and taxonomy of lossless graph compression and space-efficient graph representations. arXiv preprint arXiv:1806.01799, 2018.Google ScholarGoogle Scholar
  17. M. Besta, D. Stanojevic, T. Zivic, J. Singh, M. Hoerold, and T. Hoefler. Log (graph): a near-optimal high-performance graph representation. In Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, page 7. ACM, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. G. Bilardi and A. Pietracaprina. Models of Computation, Theoretical, pages 1150--1158. Springer US, Boston, MA, 2011.Google ScholarGoogle Scholar
  19. G. E. Blelloch. Problem based benchmark suite, 2011.Google ScholarGoogle Scholar
  20. G. E. Blelloch and B. M. Maggs. Parallel Algorithms, page 25. Chapman & Hall/CRC, 2 edition, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10):P10008, 2008.Google ScholarGoogle Scholar
  22. P. Boldi and S. Vigna. The WebGraph framework I: Compression techniques. In World Wide Web Conf. (WWW), pages 595--601, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. P. Boncz. LDBC: benchmarks for graph and RDF data management. In IDEAS, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. C. Bron and J. Kerbosch. Algorithm 457: finding all cliques of an undirected graph. CACM, 1973. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. Burtscher, R. Nasre, and K. Pingali. A quantitative study of irregular programs on gpus. In 2012 IEEE International Symposium on Workload Characterization (IISWC), pages 141--151. IEEE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. V. Carletti et al. Introducing vf3: A new algorithm for subgraph isomorphism. In Springer GbRPR, 2017.Google ScholarGoogle Scholar
  27. V. Carletti et al. The VF3-light subgraph isomorphism algorithm: when doing less is more effective. In Springer S+SSPR, 2018.Google ScholarGoogle Scholar
  28. V. Carletti et al. A parallel algorithm for subgraph isomorphism. In Springer GbRPR, 2019.Google ScholarGoogle ScholarCross RefCross Ref
  29. F. Cazals and C. Karande. A note on the problem of reporting maximal cliques. Theoretical Computer Science, 407(1--3):564--568, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. P. Celis. Robin hood hashing. University of Waterloo, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. D. Chakrabarti and C. Faloutsos. Graph mining: Laws, generators, and algorithms. ACM CSUR, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. S. Chambi, D. Lemire, O. Kaser, and R. Godin. Better bitmap performance with roaring bitmaps. Software: practice and experience, 46(5):709--719, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In 2009 IEEE international symposium on workload characterization (IISWC), pages 44--54. Ieee, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. H. Chen, M. Liu, Y. Zhao, X. Yan, D. Yan, and J. Cheng. G-miner: an efficient task-oriented graph mining system. In Proceedings of the Thirteenth EuroSys Conference, page 32. ACM, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. X. Chen, R. Dathathri, G. Gill, and K. Pingali. Pangolin: An efficient and flexible graph mining system on cpu and gpu. arXiv preprint arXiv:1911.06969, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. J. Cheng, L. Zhu, Y. Ke, and S. Chu. Fast algorithms for maximal clique enumeration with limited memory. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1240--1248, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM Journal on computing, 14(1):210--223, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. A. Ching et al. One trillion edges: Graph processing at facebook-scale. VLDB, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. D. J. Cook and L. B. Holder. Mining graph data. John Wiley & Sons, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. L. P. Cordella et al. A (sub) graph isomorphism algorithm for matching large graphs. IEEE TPAMI, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. M. Danisch et al. Listing k-cliques in sparse real-world graphs. In WWW, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. A. Das et al. Shared-memory parallel maximal clique enumeration. In IEEE HiPC, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. A. Das, M. Svendsen, and S. Tirthapura. Change-sensitive algorithms for maintaining maximal cliques in a dynamic graph. CoRR, vol. abs/1601.06311, 2016.Google ScholarGoogle Scholar
  44. C. Demetrescu, A. V. Goldberg, and D. S. Johnson. The Shortest Path Problem: Ninth DIMACS Implementation Challenge, volume 74. American Math. Soc., 2009.Google ScholarGoogle ScholarCross RefCross Ref
  45. L. Dhulipala et al. Theoretically efficient parallel graph algorithms can be fast and scalable. In ACM SPAA, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. L. Dhulipala, J. Shi, T. Tseng, G. E. Blelloch, and J. Shun. The graph based benchmark suite (gbbs). In GRADES and NDA, pages 1--8, 2020. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. V. Dias et al. Fractal: A general-purpose graph pattern mining system. In ACM SIGMOD, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. R. Diestel. Graph theory. Springer, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. N. Du, B. Wu, L. Xu, B. Wang, and X. Pei. A parallel algorithm for enumerating all maximal cliques in complex network. In Sixth IEEE International Conference on Data Mining-Workshops (ICDMW'06), pages 320--324. IEEE, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. J. D. Eblen, C. A. Phillips, G. L. Rogers, and M. A. Langston. The maximum clique enumeration problem: algorithms, applications, and implementations. In BMC bioinformatics, volume 13, page S5. Springer, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. D. Eppstein et al. Listing all maximal cliques in sparse graphs in near-optimal time. In SAAC, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  52. D. Eppstein and D. Strash. Listing all maximal cliques in large sparse real-world graphs. In International Symposium on Experimental Algorithms, pages 364--375. Springer, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. P. Erdős and A. Rényi. On the evolution of random graphs. Selected Papers of Alfréd Rényi, 1976.Google ScholarGoogle Scholar
  54. M. Farach-Colton and M. Tsai. Computing the degeneracy of large graphs. In LATIN, 2014.Google ScholarGoogle Scholar
  55. B. Gallagher. Matching structure and semantics: A survey on graph-based pattern matching. In AAAI Fall Symposium: Capturing and Using Patterns for Evidence Detection, pages 45--53, 2006.Google ScholarGoogle Scholar
  56. M. Han, K. Daudjee, K. Ammar, M. T. Özsu, X. Wang, and T. Jin. An experimental comparison of pregel-like graph processing systems. Proceedings of the VLDB Endowment, 7(12):1047--1058, 2014. Google ScholarGoogle ScholarCross RefCross Ref
  57. S. Han, L. Zou, and J. X. Yu. Speeding up set intersections in graph algorithms using simd instructions. In Proceedings of the 2018 International Conference on Management of Data, pages 1587--1602. ACM, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. W.-S. Han et al. Turbo iso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In ACM SIGMOD/PODS. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. W. Hasenplaugh, T. Kaler, T. B. Schardl, and C. E. Leiserson. Ordering heuristics for parallel graph coloring. In 26th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '14, Prague, Czech Republic - June 23 - 25, 2014, pages 166--177, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. T. Horváth et al. Cyclic pattern kernels for predictive graph mining. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. M. Injadat, F. Salo, and A. B. Nassif. Data mining techniques in social media: A survey. Neurocomputing, 214:654--670, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. A. P. Iyer, Z. Liu, X. Jin, S. Venkataraman, V. Braverman, and I. Stoica. {ASAP}: Fast, approximate graph pattern mining at scale. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18), pages 745--761, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. S. Jabbour, N. Mhadhbi, B. Raddaoui, and L. Sais. Pushing the envelope in overlapping communities detection. In International Symposium on Intelligent Data Analysis, pages 151--163. Springer, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  64. K. Jamshidi, R. Mahadasa, and K. Vora. Peregrine: a pattern-aware graph mining system. In Proceedings of the Fifteenth European Conference on Computer Systems, pages 1--16, 2020. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. R. A. Jarvis and E. A. Patrick. Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on computers, 100(11):1025--1034, 1973. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. C. Jiang, F. Coenen, and M. Zito. A survey of frequent subgraph mining algorithms. The Knowledge Engineering Review, 28(1):75--105, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  67. D. S. Johnson, M. Yannakakis, and C. H. Papadimitriou. On generating all maximal independent sets. Information Processing Letters, 27(3):119--123, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. A. Joshi, Y. Zhang, P. Bogdanov, and J.-H. Hwang. An efficient system for subgraph discovery. In 2018 IEEE International Conference on Big Data (Big Data), pages 703--712. IEEE, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  69. D. R. Karger and C. Stein. A new approach to the minimum cut problem. Journal of the ACM (JACM), 43(4):601--640, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. W. Khaouid et al. K-core decomposition of large networks on a single pc. Proceedings of the VLDB Endowment, 9(1):13--23, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. I. Koch. Enumerating all connected maximal common subgraphs in two graphs. Theoretical Computer Science, 250(1--2):1--30, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. F. Kose, W. Weckwerth, T. Linke, and O. Fiehn. Visualizing plant metabolomic correlation networks using clique-metabolite matrices. Bioinformatics, 17(12):1198--1208, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  73. J. Kunegis. Konect: the koblenz network collection. In Proc. of Intl. Conf. on World Wide Web (WWW), pages 1343--1350. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. V. E. Lee, N. Ruan, R. Jin, and C. Aggarwal. A survey of algorithms for dense subgraph discovery. In Managing and Mining Graph Data, pages 303--336. Springer, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. E. A. Leicht et al. Vertex similarity in networks. Physical Review E, 73(2):026120, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  76. D. Lemire, O. Kaser, N. Kurz, L. Deri, C. O'Hara, F. Saint-Jacques, and G. Ssi-Yan-Kai. Roaring bitmaps: Implementation of an optimized software library. Software: Practice and Experience, 48(4):867--895, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  77. J. Leskovec et al. Kronecker graphs: An approach to modeling networks. J. of Machine Learning Research, 11(Feb):985--1042, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. J. Leskovec and A. Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.Google ScholarGoogle Scholar
  79. B. Lessley, T. Perciano, M. Mathai, H. Childs, and E. W. Bethel. Maximal clique enumeration with data-parallel primitives. In 2017 IEEE 7th Symposium on Large Data Analysis and Visualization (LDAV), pages 16--25. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  80. D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American society for information science and technology, 58(7):1019--1031, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  81. H. Lin et al. Shentu: processing multi-trillion edge graphs on millions of cores in seconds. In ACM/IEEE Supercomputing. IEEE Press, 2018.Google ScholarGoogle Scholar
  82. L. Lu, Y. Gu, and R. Grossman. dmaximalcliques: A distributed algorithm for enumerating all maximal cliques and maximal clique distribution. In 2010 IEEE International Conference on Data Mining Workshops, pages 1320--1327. IEEE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  83. L. Lü and T. Zhou. Link prediction in complex networks: A survey. Physica A: statistical mechanics and its applications, 390(6):1150--1170, 2011.Google ScholarGoogle Scholar
  84. K. Makino and T. Uno. New algorithms for enumerating all maximal cliques. In Scandinavian workshop on algorithm theory, pages 260--272. Springer, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  85. G. Manoussakis. An output sensitive algorithm for maximal clique enumeration in sparse graphs. In 12th International Symposium on Parameterized and Exact Computation (IPEC 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.Google ScholarGoogle Scholar
  86. D. W. Matula and L. L. Beck. Smallest-last ordering and clustering and graph coloring algorithms. JACM, 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. D. Mawhirter, S. Reinehr, C. Holmes, T. Liu, and B. Wu. Graphzero: Breaking symmetry for efficient graph mining. arXiv preprint arXiv:1911.12877, 2019.Google ScholarGoogle Scholar
  88. D. Mawhirter and B. Wu. Automine: harmonizing high-level abstraction and high performance for graph mining. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, pages 509--523. ACM, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  89. C. McCreesh and P. Prosser. A parallel, backjumping subgraph isomorphism algorithm using supplemental graphs. In CP. Springer, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  90. G. L. Miller et al. Improved parallel algorithms for spanners and hopsets. In ACM SPAA. ACM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  91. Z. Ming, C. Luo, W. Gao, R. Han, Q. Yang, L. Wang, and J. Zhan. Bdgs: A scalable big data generator suite in big data benchmarking. In Advancing Big Data Benchmarks, pages 138--154. Springer, 2013.Google ScholarGoogle Scholar
  92. P. J. Mucci, S. Browne, C. Deane, and G. Ho. Papi: A portable interface to hardware performance counters. In Proceedings of the department of defense HPCMP users group conference, volume 710, 1999.Google ScholarGoogle Scholar
  93. R. C. Murphy et al. Introducing the graph 500. Cray User's Group (CUG), 2010.Google ScholarGoogle Scholar
  94. L. Nai, Y. Xia, I. G. Tanase, H. Kim, and C.-Y. Lin. Graphbig: understanding graph computing in the context of industrial solutions. In SC'15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1--12. IEEE, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  95. T. J. Ottosen and J. Vomlel. Honour thy neighbour---clique maintenance in dynamic graphs. on Probabilistic Graphical Models, page 201, 2010.Google ScholarGoogle Scholar
  96. S. Parthasarathy, S. Tatikonda, and D. Ucar. A survey of graph mining techniques for biological datasets. In Managing and mining graph data, pages 547--580. Springer, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  97. U. N. Raghavan et al. Near linear time algorithm to detect community structures in large-scale networks. Physical review E, 76(3):036106, 2007.Google ScholarGoogle Scholar
  98. T. Ramraj and R. Prabhakar. Frequent subgraph mining algorithms-a survey. Procedia Computer Science, 47:197--204, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  99. S. U. Rehman, A. U. Khan, and S. Fong. Graph mining: A survey of graph mining techniques. In Seventh International Conference on Digital Information Management (ICDIM 2012), pages 88--92. IEEE, 2012.Google ScholarGoogle ScholarCross RefCross Ref
  100. P. Ribeiro, P. Paredes, M. E. Silva, D. Aparicio, and F. Silva. A survey on subgraph counting: Concepts, algorithms and applications to network motifs and graphlets. arXiv preprint arXiv:1910.13011, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  101. I. Robinson, J. Webber, and E. Eifrem. Graph databases. "O'Reilly Media, Inc.", 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. R. A. Rossi and N. K. Ahmed. The network data repository with interactive graph analytics and visualization. In AAAI, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  103. S. E. Schaeffer. Graph clustering. Computer science review, 1(1):27--64, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  104. T. Schank. Algorithmic aspects of triangle-based network analysis. Phd in computer science, University Karlsruhe, 3, 2007.Google ScholarGoogle Scholar
  105. M. C. Schmidt, N. F. Samatova, K. Thomas, and B.-H. Park. A scalable, parallel algorithm for maximal clique enumeration. Journal of Parallel and Distributed Computing, 69(4):417--428, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  106. J. Shun and G. E. Blelloch. Ligra: a lightweight graph processing framework for shared memory. In ACM SIGPLAN Notices, volume 48, pages 135--146, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  107. J. Shun and K. Tangwongsan. Multicore triangle computations without tuning. In Data Engineering (ICDE), 2015 IEEE 31st International Conference on, pages 149--160. IEEE, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  108. C. L. Staudt and H. Meyerhenke. Engineering parallel algorithms for community detection in massive networks. IEEE Transactions on Parallel and Distributed Systems, 27(1):171--184, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  109. V. Stix. Finding all maximal cliques in dynamic graphs. Computational Optimization and applications, 27(2):173--186, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  110. J. A. Stratton, C. Rodrigues, I.-J. Sung, N. Obeid, L.-W. Chang, N. Anssari, G. D. Liu, and W.-m. W. Hwu. Parboil: A revised benchmark suite for scientific and commercial throughput computing. Center for Reliable and High-Performance Computing, 127, 2012.Google ScholarGoogle Scholar
  111. M. Svendsen, A. P. Mukherjee, and S. Tirthapura. Mining maximal cliques from a large graph using mapreduce: Tackling highly uneven subproblem sizes. Journal of Parallel and distributed computing, 79:104--114, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  112. L. Tang and H. Liu. Graph mining applications to social network analysis. In Managing and Mining Graph Data, pages 487--513. Springer, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  113. Y. Tang. Benchmarking graph databases with cyclone benchmark. 2016.Google ScholarGoogle Scholar
  114. B. Taskar et al. Link prediction in relational data. In NIPS, pages 659--666, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  115. C. H. Teixeira et al. Arabesque: a system for distributed graph mining. In Proceedings of the 25th Symposium on Operating Systems Principles, pages 425--440. ACM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. L. T. Thomas, S. R. Valluri, and K. Karlapalem. Margin: Maximal frequent subgraph mining. ACM Transactions on Knowledge Discovery from Data (TKDD), 4(3):1--42, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  117. E. Tomita, A. Tanaka, and H. Takahashi. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci., 363(1):28--42, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  118. S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa. A new algorithm for generating all the maximal independent sets. SIAM Journal on Computing, 6(3):505--517, 1977.Google ScholarGoogle ScholarCross RefCross Ref
  119. J. R. Ullmann. An algorithm for subgraph isomorphism. Journal of the ACM (JACM), 23(1):31--42, 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  120. K. Wang, Z. Zuo, J. Thorpe, T. Q. Nguyen, and G. H. Xu. Rstream: marrying relational algebra with streaming for efficient graph mining on a single machine. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18), pages 763--782, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  121. L. Wang, K. Hu, and Y. Tang. Robustness of link-prediction algorithm based on similarity and application to biological networks. Current Bioinformatics, 9(3):246--252, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  122. L. Wang, J. Zhan, C. Luo, Y. Zhu, Q. Yang, Y. He, W. Gao, Z. Jia, Y. Shi, S. Zhang, et al. Bigdatabench: A big data benchmark suite from internet services. In 2014 IEEE 20th international symposium on high performance computer architecture (HPCA), pages 488--499. IEEE, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  123. T. Washio and H. Motoda. State of the art of graph-based data mining. Acm Sigkdd Explorations Newsletter, 5(1):59--68, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  124. B. Wu, S. Yang, H. Zhao, and B. Wang. A distributed algorithm to enumerate all maximal cliques in mapreduce. In 2009 Fourth International Conference on Frontier of Computer Science and Technology, pages 45--51. IEEE, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  125. Y. Xu, J. Cheng, and A. W.-C. Fu. Distributed maximal clique computation and management. IEEE Transactions on Services Computing, 9(1):110--122, 2015.Google ScholarGoogle Scholar
  126. Z. Xu, X. Chen, J. Shen, Y. Zhang, C. Chen, and C. Yang. Gardenia: A graph processing benchmark suite for next-generation accelerators. ACM Journal on Emerging Technologies in Computing Systems (JETC), 15(1):1--13, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. D. Yan, H. Chen, J. Cheng, M. T. Özsu, Q. Zhang, and J. Lui. G-thinker: big graph mining made easier and faster. arXiv preprint arXiv:1709.03110, 2017.Google ScholarGoogle Scholar
  128. D. Yan, W. Qu, G. Guo, and X. Wang. Prefixfpm: A parallel framework for general-purpose frequent pattern mining. In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE) 2020, 2020.Google ScholarGoogle ScholarCross RefCross Ref
  129. P. Yao et al. A locality-aware energy-efficient accelerator for graph mining applications. In IEEE/ACM MICRO, pages 895--907. IEEE, 2020.Google ScholarGoogle ScholarCross RefCross Ref
  130. Y. Zhang et al. Genome-scale computational approaches to memory-intensive applications in systems biology. In ACM/IEEE Supercomputing, pages 12--12. IEEE, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  131. C. Zhao, Z. Zhang, P. Xu, T. Zheng, and X. Cheng. Kaleido: An efficient out-of-core graph mining system on a single machine. arXiv preprint arXiv:1905.09572, 2019.Google ScholarGoogle Scholar

Index Terms

(auto-classified)
  1. GraphMineSuite: enabling high-performance and programmable graph mining algorithms with set algebra

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image Proceedings of the VLDB Endowment
        Proceedings of the VLDB Endowment  Volume 14, Issue 11
        July 2021
        732 pages
        ISSN:2150-8097
        Issue’s Table of Contents

        Publisher

        VLDB Endowment

        Publication History

        • Published: 1 July 2021

        Qualifiers

        • research-article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader
      About Cookies On This Site

      We use cookies to ensure that we give you the best experience on our website.

      Learn more

      Got it!