The Big Graph Mining (BGM) workshop brings together researchers and practitioners to address various aspects of graph mining in this new era of big data, such as new graph mining platforms, theories that drive new graph mining techniques, scalable algorithms and visual analytics tools that spot patterns and anomalies, applications that touch our daily lives, and more. Together, we explore and discuss how these important facets of graph mining are advancing in this age of big graphs.
BGM will be a half-day workshop on Monday, April 7, at WWW 2014 at the Coex Convention and Exhibition Center (Room 314) in Seoul, Korea.
From social networks to language modeling, the growing scale and importance of graph data have driven the development of new graph-parallel systems. In this talk, I will review the graph-parallel abstraction and describe how it can be used to express important machine learning and graph analytics algorithms such as PageRank and latent factor models. I will present how systems like GraphLab and Pregel exploit restrictions in the graph-parallel abstraction, along with advances in distributed graph representation, to efficiently execute iterative graph algorithms orders of magnitude faster than more general data-parallel systems.
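To make the vertex-centric abstraction concrete, here is a minimal single-machine sketch of PageRank written in the gather/scatter style these systems use. It is illustrative only: the names are ours and it mirrors neither the GraphLab nor the Pregel API, and real systems partition the vertices across machines and run each iteration as a bulk-synchronous superstep.

```python
# Sketch of PageRank as a vertex-centric (Pregel-style) program.
# Illustrative only; not the GraphLab or Pregel API.

def pagerank(edges, num_iters=20, damping=0.85):
    """edges: list of (src, dst) pairs; returns {vertex: rank}."""
    vertices = {v for e in edges for v in e}
    out_edges = {v: [] for v in vertices}
    for src, dst in edges:
        out_edges[src].append(dst)

    rank = {v: 1.0 for v in vertices}
    for _ in range(num_iters):
        # "Scatter": each vertex sends rank / out_degree along its out-edges.
        messages = {v: 0.0 for v in vertices}
        for src in vertices:
            if out_edges[src]:
                share = rank[src] / len(out_edges[src])
                for dst in out_edges[src]:
                    messages[dst] += share
        # "Gather/apply": each vertex folds its incoming messages into a new rank.
        rank = {v: (1 - damping) + damping * messages[v] for v in vertices}
    return rank

# Example: a three-vertex cycle with one shortcut edge.
print(pagerank([(1, 2), (2, 3), (3, 1), (1, 3)]))
```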
Unfortunately, the same restrictions that enable graph-parallel systems to achieve substantial performance gains also limit their ability to express many of the important stages in a typical graph-analytics pipeline. As a consequence, existing approaches to graph analytics typically compose multiple systems through brittle and costly file interfaces. To fill the need for a holistic approach to graph analytics, we introduce GraphX, which unifies graph-parallel and data-parallel computation under a single API and system. I will show how a simple set of data-parallel operators can express graph-parallel computation, and how, by applying a collection of query optimizations derived from our work on graph-parallel systems, entire graph-analytics pipelines can be executed efficiently in a single distributed, fault-tolerant system, achieving performance comparable to specialized state-of-the-art systems.
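The key observation, that graph-parallel computation can be phrased with ordinary data-parallel operators, can be sketched in a few lines: store the graph as a vertex table and an edge table, then implement one round of message passing as a join with the source vertices followed by a grouped aggregation at the destinations. The sketch below is a hypothetical single-machine illustration of that pattern, not the actual GraphX API.

```python
# One round of vertex-centric message passing expressed with data-parallel
# operators (join + group-by aggregation). Hypothetical illustration of the
# pattern GraphX builds on; not the GraphX API.

from collections import defaultdict
from functools import reduce

def message_passing_round(vertices, edges, send, merge):
    """vertices: {vid: attr}; edges: list of (src, dst) pairs.
    Returns {vid: merged message} for every vertex that received mail."""
    # "Join" each edge with its source vertex's attribute, emitting a message.
    mail = [(dst, send(vertices[src])) for src, dst in edges]
    # "Group by" destination and reduce the messages with `merge`.
    inbox = defaultdict(list)
    for dst, msg in mail:
        inbox[dst].append(msg)
    return {dst: reduce(merge, msgs) for dst, msgs in inbox.items()}

# Example: compute each vertex's in-degree by sending 1 along every edge.
vertices = {1: None, 2: None, 3: None}
edges = [(1, 2), (2, 3), (1, 3)]
print(message_passing_round(vertices, edges,
                            send=lambda attr: 1,
                            merge=lambda a, b: a + b))
# -> {2: 1, 3: 2}
```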
Joseph Gonzalez is a postdoc in the AMPLab at UC Berkeley. Joseph received his PhD from the Machine Learning Department at Carnegie Mellon University where he worked with Carlos Guestrin on parallel algorithms and abstractions for scalable probabilistic machine learning. Joseph is a recipient of the AT&T Labs Graduate Fellowship and the NSF Graduate Research Fellowship.
Invited talk (20 min)
Paper talks (20 min each)
Workshop: Mon, Apr 7
All papers will be peer-reviewed in a single-blind process. We welcome many kinds of papers, including (but not limited to):
Authors should clearly indicate in the abstract which kind of submission the paper belongs to, to help reviewers better understand its contribution. Submissions must be in PDF, written in English, no more than 6 pages long (shorter papers are welcome), and formatted according to the standard double-column ACM Proceedings style (tighter alternate style).
For accepted papers, at least one author must attend the workshop to present the work. Accepted papers will be included in the ACM Digital Library.
Submit your paper at our BGM 2014 submission site.

If you plan to extend your workshop paper submitted to our BGM'14 workshop and submit that extended work to future WWW conferences, please note the following message from the WWW workshop co-chairs: "Any paper published by the ACM, IEEE, etc. which can be properly cited constitutes research which must be considered in judging the novelty of a WWW submission, whether the published paper was in a conference, journal, or workshop. Therefore, any paper previously published as part of a WWW workshop must be referenced and extended with new content to qualify as a new submission to the Research Track at the WWW conference."