
Scalable Data Processing for Big Data from Laptop, Multi-core, to Cluster Computing

Thursday, 14 July 2016, 9.00am to 5.00pm
Location: University of Cambridge Computer Laboratory (FW26)

Organisers

Eiko Yoneki (University of Cambridge) 

Thomas Heinis (Imperial College London)

Timothy Jones (University of Cambridge)


Large-scale data analytics is emerging as a huge consumer of computational resources due to its complex, data-hungry algorithms. Graph and networked data analysis in particular is becoming increasingly important for solving problems in diverse fields. As these problems grow in scale, parallel computing resources are required to meet their computational and memory demands. Notably, the algorithms, software and hardware that have worked well for developing mainstream parallel applications are usually not effective for massive-scale data from the real world, which exhibits more complex structure. Research into large-scale data processing is currently fragmented. This workshop brings together researchers from systems, computer architecture, algorithms and databases to discuss emerging trends and to identify opportunities for future advances in data processing.

The workshop takes the format of presentations by key researchers and discussions on specific topics. It has the following goals:

  • Identify clear application areas and algorithms that are of interest and representative of large-scale data processing as a whole.
  • Close the gap between domain algorithm and systems researchers. In particular, algorithm designers are often unaware of the constraints imposed by systems and of the best way to account for them when designing graph algorithms for big (graph) data. Conversely, the systems community often misses advances in algorithm design that could cut processing time and scale systems up in terms of the size of the problems they can address.
  • Build consensus on programming paradigms. This effort is currently fragmented between researchers building domain-specific languages or databases for executing and storing such workloads, and researchers trying to fit existing systems and their programming models to these applications. Closing this gap is critical to making these tools available to the wider network science community, as well as to opening up entirely new research areas such as algorithm-independent optimisation.


For further details, and to view the programme and abstracts, click here.