SKA: the ultimate big data challenge

By Samantha Perry, co-founder of WomeninTechZA
Johannesburg, 16 May 2013
Simon Ratcliffe, DOME SA, says this project lays the foundation to help the scientific community solve other data challenges such as climate change, genetic information and personal medical data. Photography: Sean Wilson

The human race is on a journey to deep space. The mission: to explore the origins of the universe. Using the Square Kilometre Array (SKA) radio telescope, radio signals from deepest space will be collected for scientists the world over to analyse and process. Their challenge? How to transport, store and process the estimated 14 exabytes of data the antennas will gather every day.

The SKA is a radio telescope being built by 10 countries, including SA. It's the largest joint project of its kind ever attempted and will result in the construction of the largest and most sensitive radio telescope ever built.

"The SKA will see back to a time before the first stars lit up," says IBM. "Optical telescopes see the light from stars. Before stars existed, there was only gas; a radio telescope with the sensitivity of the SKA can see back in time to the gas that existed before stars were born."

The antennas that make up the array will be distributed across SA and Australia - literally millions of them. "Because the telescope is to be made up of so many individual antennas, and the antennas are so widely scattered, and such a large volume of data is being gathered, a novel computing system must be developed to manage the process of gathering, storing and analysing data from end to end," says an IBM whitepaper on the system.

To this end, IBM is collaborating with the Netherlands Institute for Radio Astronomy (ASTRON) on a five-year project called DOME, which aims to design a computing system that could manage the data SKA is going to generate. SA's National Research Foundation has joined the initiative, in a four-year collaboration. DOME will see research conducted into extremely fast but low-power exascale computing systems.

The collaboration includes a user platform where organisations from around the world can jointly investigate emerging technologies in high-performance, energy-efficient computing, nanophotonics and data streaming, says IBM. SA has now joined as a user platform partner.

Collaboration

The DOME collaboration brings together a dream team of scientists and engineers in an exciting partnership of public and private institutions. "This project lays the foundation to help the scientific community solve other data challenges such as climate change, genetic information and personal medical data," says Simon Ratcliffe, technical co-ordinator, DOME South Africa.

Introducing SKA

The project is led by the SKA Organisation, a not-for-profit company headquartered at Jodrell Bank Observatory, near Manchester, UK. It was established in December 2011 to formalise relationships between the international partners and centralise the leadership of the project.
The SKA will be built in southern Africa and Australia. There will be 3 000 dish antennas, each about 15m in diameter, as well as two other types of radio wave receptor, known as low- and mid-frequency aperture array antennas. The mid-frequency aperture arrays will be built in SA and are envisaged to be a major component of the SKA Phase 2. The antennas will be arranged in five spiral arms and the dishes in southern Africa will extend to distances of at least 3 000km from the centre of the core region. Construction of the SKA is expected to begin in 2017 and conclude in 2024.
SKA SA was established by the Department of Science and Technology of SA and is administered as a business unit of the National Research Foundation (NRF).
Source: IBM

Scientists from all three organisations will collaborate remotely and at the newly established ASTRON & IBM Center for Exascale Technology, in Drenthe, the Netherlands, IBM said in a statement announcing the collaboration.

According to IBM, scientists from SKA South Africa will focus on:

* Visualising the challenge - fundamental research will be conducted into signal processing and advanced computing algorithms for the capture, processing and analysis of the SKA data so clear images can be produced for astronomers to study;

* Desert-proof technology - the DOME team is researching and prototyping micro-server architectures based on liquid-cooled 3D stacked chips. The team in SA will extend this research to make the micro-servers rugged, or 'desert-proof', to handle the extreme environmental conditions where the SKA will be located; and

* Software analytics - the 64 dishes of the MeerKAT telescope in SA will be used to test and develop a sophisticated software program that will aid in designing the entire computing system holistically and optimally, taking into account all of the cost and performance trade-offs for the eventual 3 000 SKA dishes.

Breaking it down

ASTRON and IBM have mapped out seven technology projects aimed at dealing with the extreme data-handling requirements of the SKA, IBM says. All have fundamental implications for computing in the future.

FACTOIDS

* Processing the data the SKA will generate on systems available today would take the output of two nuclear power stations, about seven gigawatts, says Dr Ton Engbersen, DOME project leader, IBM Research.
* The data collected by the SKA in a single day would take nearly two million years to play back on an iPod.
* The SKA central computer will have the processing power of about 100 million PCs.
* The SKA will use enough optical fibre to wrap twice around the Earth.
* The dishes of the SKA will produce 10 times the global Internet traffic.
* The aperture arrays in the SKA could produce more than 100 times the global Internet traffic.
* The SKA will generate enough raw data to fill 15 million 64GB iPods every day.
* The SKA supercomputer will perform 10^18 operations per second - equivalent to the number of stars in three million Milky Way galaxies - in order to process all the data the SKA will produce.
* The SKA will be so sensitive that it will be able to detect an airport radar on a planet 50 light years away.
* The SKA will contain thousands of antennas with a combined collecting area of about one square kilometre (that's one million square metres!).
* Analysts estimate the London Olympics was the most data-heavy yet - with some 60GB, the equivalent of 3 000 photographs, travelling across the network in the Olympic Park every second. This, however, is only equivalent to the data rate from about half a low-frequency aperture array station in SKA phase one.
Source: The SKA Organisation
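
The daily volumes quoted in this article can be turned into average data rates with a little arithmetic. The sketch below is only a back-of-envelope check, not a calculation from IBM or the SKA Organisation. It assumes decimal units (1GB = 10^9 bytes, 1 exabyte = 10^18 bytes) and an even flow of data over 24 hours, and the function names are illustrative.

```python
# Back-of-envelope conversions for the data volumes quoted in the article.
# Assumption: decimal (SI) units, i.e. 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
# Assumption: data arrives evenly across the day, so rates are daily averages.

SECONDS_PER_DAY = 24 * 60 * 60
GB = 10**9
EB = 10**18


def bytes_per_second(bytes_per_day: int) -> float:
    """Average sustained rate implied by a daily data volume."""
    return bytes_per_day / SECONDS_PER_DAY


def ipod_volume_bytes(ipods: int, capacity_gb: int = 64) -> int:
    """Total bytes held by a number of iPods of the given capacity."""
    return ipods * capacity_gb * GB


# Figure from the introduction: an estimated 14 exabytes gathered per day.
daily_estimate = 14 * EB
# Figure from the factoids: 15 million 64GB iPods of raw data per day.
daily_ipods = ipod_volume_bytes(15_000_000)

if __name__ == "__main__":
    print(f"14 EB/day averages {bytes_per_second(daily_estimate) / 1e12:.0f} TB/s")
    print(f"15 million 64GB iPods hold {daily_ipods / EB:.2f} EB")
```

Under these assumptions, 14 exabytes a day works out to roughly 162 terabytes every second, while 15 million 64GB iPods hold about 0.96 exabytes. The two quoted figures are therefore not the same quantity, which suggests they describe different stages of the data pipeline or different phases of the project; the article does not say which.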

"The most strategic of the DOME projects is called Algorithms & Machines. The SKA challenge is so extreme and nobody has designed a data management system to handle anything like this before. So the goal here is to create an ultra-sophisticated software program that will help the team design the system holistically and optimally," IBM says.

Also of critical importance is the Access Patterns team, responsible for developing a big data repository capable of affordably handling the huge volume of data SKA will generate every day.

Transporting the data hundreds or thousands of kilometres from the antenna that collects it to the data centre where it will be stored or processed will require a high-speed fibre-optic network that can move data at 100x the rate of today's Internet traffic. The RT Communications project is working on reducing the overhead that data traditionally generates as it travels through a network. The Compressive Sampling project will work towards reducing the data that SKA creates by compressing it as it streams in.

Processing the data will be handled by hybrid machines combining supercomputers with a processor called an accelerator. Complementing these machines are DOME micro-servers, which are very small, inexpensive, energy-efficient microprocessors that can handle data filtering and analysis close to the antennas.

"The real bottleneck comes when the data reaches the computers where it's processed and analysed," says IBM. "The computers transport data internally via electronic bits moving on copper wires. So the SKA will be like attaching a fire hose to a garden sprinkler. DOME's Nanophotonics team, led by IBM researcher Bert Offrein, is taking photonics technology that IBM was already developing for general computing and applying it to the SKA challenge."

"The DOME research has implications far beyond astronomy. These scientific advances will help build the foundation for a new era of computing, providing technologies that learn and reason. Ultimately, these cognitive technologies will help to transform entire industries, including healthcare and finance," says Dr Ton Engbersen, DOME project leader, IBM Research. It is these other implications that make the SKA project so significant for the sector, for science, and for humanity.

First published in the May issue of Brainstorm magazine.
