Stony Brook-Brookhaven Collaboration Aims to Visualize Big Data

Humans are visual creatures: our brain processes images 60,000 times faster than text, and 90 percent of information sent to the brain is visual. Visualization is becoming increasingly useful in the era of big data, in which we are generating so much data at such high rates that we cannot keep up with making sense of it all. In particular, visual analytics—a research discipline that combines automated data analysis with interactive visualizations—has emerged as a promising approach to dealing with this information overload.

Stony Brook computer scientist Wei Xu

“Visual analytics provides a bridge between advanced computational capabilities and human knowledge and judgment,” said Wei Xu, a research assistant professor in the Department of Computer Science at Stony Brook University and a computer scientist in the Computational Science Initiative (CSI) at the U.S. Department of Energy’s (DOE) Brookhaven National Laboratory.

“The interactive visual representations and interfaces enable users to efficiently explore and gain insights from massive datasets.”

At Brookhaven, Xu has been leading the development of several visual analytics tools to facilitate the scientific decision-making and discovery process. She works closely with Brookhaven scientists, particularly those at the National Synchrotron Light Source II (NSLS-II) and the Center for Functional Nanomaterials (CFN)—both DOE Office of Science User Facilities. By talking to researchers early on, Xu learns about their data analysis challenges and requirements. She continues the conversation throughout the development process, demoing initial prototypes and making refinements based on their feedback. She also does her own research and proposes innovative visual analytics methods to the scientists.

Recently, Xu has been collaborating with the Visual Analytics and Imaging (VAI) Lab at Stony Brook University—her alma mater, where she completed doctoral work in computed tomography with graphics processing unit (GPU)-accelerated computing.

Though Xu continued work in these and related fields when she first joined Brookhaven Lab in 2013, she switched her focus to visualization by the end of 2015.

“I realized how important visualization is to the big data era,” Xu said. “The visualization domain, especially information visualization, is flourishing, and I knew there would be lots of research directions to pursue because we are dealing with an unsolved problem: how can we most efficiently and effectively understand the data? That is a quite interesting problem not only in the scientific world but also in general.”

It was at this time that Xu was awarded a grant for a visualization project proposal she submitted to DOE’s Laboratory Directed Research and Development program, which funds innovative and creative research in areas of importance to the nation’s energy security. At the same time, Klaus Mueller—Xu’s PhD advisor at Stony Brook and director of the VAI Lab—was seeking to extend his research to a broader domain. Xu thought it would be a great opportunity to collaborate: she would present the visualization problem that originated from scientific experiments and potential approaches to solve it, and, in turn, doctoral students in Mueller’s lab would work with her and their professor to come up with cutting-edge solutions.

Color Map

The color-mapping tool was used to visualize a multivariable fluorescence dataset from the Hard X-ray Nanoprobe (HXN) beamline at Brookhaven’s National Synchrotron Light Source II. The color map (a) shows how the different variables—the chemical elements cerium (Ce), cobalt (Co), iron (Fe), and gadolinium (Gd)—are distributed in a sample of an electrolyte material used in solid oxide fuel cells. The fluorescence spectrum of the selected data point (the circle indicated by the overlaid white arrows) is shown by the colored bars, with their height representing the relative elemental ratios. The fluorescence image (b), pseudo-colored based on the color map in (a), represents a joint colorization of the individual images in (d), whose colors are based on the four points at the circle boundary (a) for each of the four elements. The arrow indicates where new chemical phases can exist—something hard to detect when observing the individual plots (d). Enhancing the color contrast—for example, of the rectangular region in (b)—enables a more detailed view, in this case providing better contrast between Fe (red) and Co (green) in image (c).

This Brookhaven-Stony Brook collaboration first led to the development of an automated method for mapping multivariable data to color. Variables with a similar distribution of data points are assigned similar colors. Users can manipulate the color maps, for example by enhancing the contrast to view the data in more detail. According to Xu, these maps would be helpful for any image dataset involving multiple variables.

“Different imaging modalities, such as fluorescence, differential phase contrasts, x-ray scattering, and tomography, would benefit from this technique, especially when integrating the results of these modalities,” she said. “Even subtle differences that are hard to identify in separate image displays, such as differences in elemental ratios, can be picked up with our tool, a capability essential for new scientific discovery.” Xu is currently working to install the color-mapping tool at NSLS-II beamlines, with advanced features to be added gradually.
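The joint colorization described above can be illustrated with a minimal sketch. It is not the team's actual method: it assumes each variable already has an anchor color and simply blends those colors by each pixel's relative intensities, then offers a plain min-max contrast stretch. The automated step that gives similar variables similar colors is not reproduced here. The names `blend_pixel`, `colorize`, `stretch_contrast`, and `ANCHORS` are hypothetical, and only the red (Fe) and green (Co) anchors follow the figure caption; the Ce and Gd colors are placeholders.

```python
from typing import Dict, List, Tuple

RGB = Tuple[float, float, float]
Image = List[List[float]]

# Anchor colors per element. Fe = red and Co = green follow the caption;
# the Ce and Gd colors are placeholders chosen for this illustration.
ANCHORS: Dict[str, RGB] = {
    "Fe": (1.0, 0.0, 0.0),
    "Co": (0.0, 1.0, 0.0),
    "Ce": (0.0, 0.0, 1.0),
    "Gd": (1.0, 1.0, 0.0),
}


def blend_pixel(intensities: Dict[str, float], anchors: Dict[str, RGB]) -> RGB:
    """Blend anchor colors, weighting each by its share of the total signal."""
    total = sum(intensities.values())
    if total <= 0:
        return (0.0, 0.0, 0.0)  # no signal at this pixel: render black
    r = g = b = 0.0
    for name, value in intensities.items():
        weight = value / total  # relative ratio of this variable
        ar, ag, ab = anchors[name]
        r += weight * ar
        g += weight * ag
        b += weight * ab
    return (r, g, b)


def colorize(images: Dict[str, Image], anchors: Dict[str, RGB]) -> List[List[RGB]]:
    """Combine per-variable images of equal shape into one pseudo-colored image."""
    names = list(images)
    rows = len(images[names[0]])
    cols = len(images[names[0]][0])
    return [
        [blend_pixel({n: images[n][r][c] for n in names}, anchors) for c in range(cols)]
        for r in range(rows)
    ]


def stretch_contrast(values: List[float]) -> List[float]:
    """Linearly rescale values to the range [0, 1] (min-max contrast stretch)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # flat region: nothing to stretch
    return [(v - lo) / (hi - lo) for v in values]
```

Because the weights are relative ratios, this sketch encodes composition rather than overall brightness; a pixel with equal Fe and Co signal lands halfway between red and green regardless of how strong the signal is.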

In conjunction with CFN scientists, the team is also developing a multilevel display for exploring large image sets. When scientists scan a sample, they generate one scattering image at each point within the sample (the raw image level). They can zoom in on this image to check the individual pixel values (the pixel level). For each raw image, scientific analysis tools generate a series of attributes that represent the analyzed properties of the sample (the attribute level), and a scatterplot shows a pseudo-color map of any user-chosen attribute from the series, such as the sample’s temperature or density. In the past, scientists had to hop between multiple plots to view these different levels. The interactive display under development will enable scientists to see all of these levels in a single view, making it easier to identify how the raw data are related and to analyze data across the entire scanned sample. Users will be able to zoom in and out on different levels of interest, similar to how Google Maps works.
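The three levels described above can be pictured as one nested data structure. The following is a rough sketch under stated assumptions, not the display under development: the class names `ScanPoint` and `MultilevelScan`, their methods, and the use of integer grid positions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Position = Tuple[int, int]


@dataclass
class ScanPoint:
    """One scanned location in the sample (all field names are illustrative)."""
    position: Position             # where in the sample the scan was taken
    raw_image: List[List[float]]   # raw image level: one scattering image
    attributes: Dict[str, float]   # attribute level: analyzed properties


class MultilevelScan:
    """Holds every scan point so all three levels are reachable from one object."""

    def __init__(self, points: List[ScanPoint]) -> None:
        self._by_position: Dict[Position, ScanPoint] = {p.position: p for p in points}

    def attribute_map(self, name: str) -> Dict[Position, float]:
        """Attribute level: one value per scan position, ready to pseudo-color."""
        return {pos: p.attributes[name] for pos, p in self._by_position.items()}

    def raw_image(self, position: Position) -> List[List[float]]:
        """Raw image level: the scattering image recorded at one position."""
        return self._by_position[position].raw_image

    def pixel(self, position: Position, row: int, col: int) -> float:
        """Pixel level: a single value inside one raw image."""
        return self._by_position[position].raw_image[row][col]
```

Zooming in then amounts to moving from `attribute_map`, through `raw_image`, to `pixel` on the same object, rather than switching between separate plots.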

— Ariana Tantillo, Brookhaven National Laboratory
