Google Cloud’s asteroid hunt could mean better AI models for astronomy

  • Google Cloud helped the Asteroid Institute massively scale up an algorithm it built to identify potential asteroids — and found nearly 30,000 candidates in just a few weeks 

  • The pair are now working on a new AI model to help weed out false IDs

  • They plan to open source the model training data to help other scientists

Forget Bruce Willis and Ben Affleck. Google Cloud and the Asteroid Institute have teamed up to form a new superhero duo protecting Earth from wayward asteroids.  

As it turns out, the premise of the 1998 classic “Armageddon” wasn’t entirely far-fetched – humans can apparently deflect asteroids headed straight for Earth (more on that in a sec). We just have to know they’re coming first. And to do that, well, let’s just say it takes a bit of heavy computational lifting.

Cue the cloud.

Identifying and tracking asteroids involves looking at photos of space and painstakingly trying to track little dots of light across images taken by different telescopes at different times. By pinpointing where these objects are in space at different points in time, scientists can calculate their orbits and thus determine whether an asteroid is on a crash course with humanity.
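To make the dot-tracking idea concrete, here is a minimal Python sketch. It is not the Asteroid Institute's algorithm. It assumes a toy setup in which each image yields a list of (time, x, y) detections, and it treats an object as moving in a straight line at constant speed across three exposures, which is a simplification of real orbital motion. The function names, the tolerance and all coordinates are hypothetical.

```python
from itertools import product

def predict(p1, p2, t):
    """Extrapolate constant-velocity motion through p1 and p2 to time t.

    Assumes the two detections were taken at different times.
    """
    (t1, x1, y1), (t2, x2, y2) = p1, p2
    vx = (x2 - x1) / (t2 - t1)
    vy = (y2 - y1) / (t2 - t1)
    return x2 + vx * (t - t2), y2 + vy * (t - t2)

def link_detections(frame1, frame2, frame3, tol=0.5):
    """Return detection triples consistent with straight-line motion."""
    links = []
    for p1, p2 in product(frame1, frame2):
        for p3 in frame3:
            px, py = predict(p1, p2, p3[0])
            # Keep the triple only if the third dot sits where the
            # first two dots predict it should be (within tol).
            if abs(px - p3[1]) <= tol and abs(py - p3[2]) <= tol:
                links.append((p1, p2, p3))
    return links

# Made-up detections: (time in days, x, y). One real mover plus noise.
frame1 = [(0.0, 10.0, 10.0), (0.0, 50.0, 80.0)]
frame2 = [(1.0, 12.0, 11.0), (1.0, 30.0, 30.0)]
frame3 = [(2.0, 14.0, 12.0), (2.0, 70.0, 5.0)]

links = link_detections(frame1, frame2, frame3)
print(links)
```

Even in this toy version, every pair of dots has to be checked against every dot in the third image, which hints at why the work balloons as the number of images grows.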

Ed Lu, executive director at the Asteroid Institute, told Fierce Network the institute doesn't actually have its own telescope, so it has to comb through publicly available image datasets. Complicating matters further, asteroids are typically identified with telescopes using a very specific process, and most of the images the institute uses are taken by telescopes that aren't being used for asteroid ID.

Suffice it to say the process can be a bit tricky. Still, Lu's team of scientists was able to develop an algorithm to streamline it. However, they quickly encountered a problem.

“What we realized though was if you want to do this repeatedly over lots and lots and lots of data, that wasn’t going to work all on a laptop machine. In fact, you need the equivalent of millions of those laptop machines,” he said. “We used Google Cloud’s built-in tools in order to scale up an algorithm we already had working — and scale it up massively.”

Massimo Mascaro, technical director in Google Cloud’s Office of the CTO, said the hyperscaler worked with Lu’s team to containerize the algorithm using Google Kubernetes Engine. And the process was intense.

“These algorithms are not just simple pieces of code, they usually come with a lot of dependencies around needing special kinds of images, running special kinds of computation,” he explained. “So, the problem is not just to get a bunch of machines, it’s how do you deploy efficiently a coherent set of system assets in order to run that code and how do you schedule it in an efficient way.”
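The scheduling problem Mascaro describes can be sketched in miniature. The Python example below is only an illustration under stated assumptions: a local thread pool stands in for a cluster of containers, `process_batch` is a hypothetical placeholder for the real per-batch computation, and the image identifiers and batch size are made up. It shows the general pattern of cutting a large job into independent batches and handing them to a fixed set of workers, not how the Google Kubernetes Engine deployment was actually built.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def process_batch(batch):
    """Hypothetical stand-in for the real work: just count the images."""
    return len(batch)

# Made-up image identifiers standing in for a public image archive.
image_ids = [f"image-{n:04d}" for n in range(1000)]
batches = chunk(image_ids, 64)

# A small worker pool plays the role of the cluster; each batch is
# independent, so batches can be processed in any order.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process_batch, batches))

total = sum(results)
print(len(batches), "batches,", total, "images processed")
```

In a containerized setup, each batch would more plausibly be handed to its own container carrying the algorithm's dependencies, which is the "coherent set of system assets" part of the problem.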

But it seems worth it when you see the result: Using Google Cloud’s compute muscle, the Asteroid Institute was able to identify more than 27,000 new potential asteroids in Q1 2024 alone. That includes around 100 “near-Earth” objects. Gasp!

[Image: asteroid discovery chart]

Avoiding impact

Lu explained there are actually a few reasons we need to know where asteroids are. One is, of course, for science! That is, asteroids are an important record of the history of our solar system, he said. The second is for the future – Lu noted asteroids are likely places we’ll navigate to and around in the future, and they also likely contain critical resources like water and minerals that future space explorers will need.

And the third reason brings us back to “Armageddon.”

“Sometimes these things hit the Earth,” he said, pointing to the Chelyabinsk asteroid that hit Russia in 2013 and the infamous Tunguska Event that occurred in 1908. “In order to do something about it, we have to find them and track them.”

“It turns out deflecting asteroids is relatively straightforward if you know that one is coming,” he added.

Excuse me?

As Lu explained, NASA proved out our ability to knock asteroids off a collision course with Earth a few years back as part of its DART mission. In layman’s terms, the process kind of works like bumper cars, except it’s a spacecraft bumping into a space rock. On a long enough timeline, a small bump is enough to make a big difference, he said.
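A bit of back-of-the-envelope arithmetic shows why lead time matters so much. The Python sketch below assumes a purely illustrative velocity change of 1 millimeter per second and simply multiplies it by the warning time. That straight-line estimate ignores orbital mechanics entirely, and the numbers are not taken from the DART mission, so treat it as a rough illustration rather than a real deflection calculation.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def drift_km(delta_v_mm_per_s, lead_time_years):
    """Straight-line displacement from a tiny velocity change, in km."""
    delta_v_m_per_s = delta_v_mm_per_s / 1000.0
    meters = delta_v_m_per_s * lead_time_years * SECONDS_PER_YEAR
    return meters / 1000.0

# Same tiny bump (an assumed 1 mm/s), different amounts of warning.
for years in (1, 10, 20):
    print(f"{years:>2} years of warning: ~{drift_km(1.0, years):,.0f} km")
```

Under these assumptions, the same bump moves the asteroid roughly 32 km with one year of warning and roughly 316 km with 10, which is why finding and tracking asteroids early comes first.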

Asteroid AI

Though they’ve already identified tens of thousands of potential asteroids, Google Cloud and the Asteroid Institute have more work to do. According to Lu and Mascaro, the next step is developing an artificial intelligence (AI) model to help scientists sift through all the new asteroid candidates they’ve found.

Lu said now that the initial identification work is complete, scientists need to do a more detailed screening of the images they’ve found to ensure that what’s pictured is actually an asteroid and not — for instance — a nebula, galaxy or cosmic ray that fooled the algorithm. That means the bottleneck has shifted from identification to verification.

Verification has historically been a manual process, but Lu said that means scientists have inadvertently created a training data set that is the “perfect fodder” for creating a custom AI model. That’s what Google Cloud and the institute are working on now.

[Image: depiction of asteroids found in the solar system]
Discoveries visualized in the inner Solar System. Main belt asteroid discoveries are shown in green. (B612 Asteroid Institute / University of Washington DiRAC Institute / OpenSpace Project)

Mascaro said one issue they’ve encountered is that existing models have largely been trained on “worldly” images. Think cats, bicycles and so on. In this case, they’re trying to train a model using an entirely new kind of imagery.

The cool part is that all the work they’re putting into building this AI won’t just help this project. Lu and Mascaro said the institute plans to open source the data set it is using.

While this is mostly meant to build trust with a wary public in the event that an Earth-bound asteroid is found, it also means other astronomers, astrophysicists and scientists will be able to leverage the training data for a variety of tasks requiring space-based inputs.

“We don’t expect this to be the last step in building this kind of AI,” Mascaro said.

Scientific tipping point

According to Mascaro, Google Cloud’s collaboration with the Asteroid Institute isn’t meant to be a one-off initiative. Instead, he said the company’s Office of the CTO uses projects like these as learning experiences whose lessons can be applied more broadly across Google products for all industries.

“This whole idea of using data that has not been collected for a given purpose and readapting to a different purpose, it is a big problem in the industry way beyond discovering asteroids. And that is something I am working on with other big commercial partners of Google,” he said.

“To be honest, this is the tip of the iceberg in terms of computational capabilities being applied more massively to science, and in particular cloud capabilities,” Mascaro continued.

While the scientific community has stepped up its use of computational resources over the past few decades, Mascaro noted that hasn’t necessarily translated into cloud adoption. That’s in part due to funding restrictions, fear of the unknown and a desire among academic institutions to leverage the computational muscle they’ve invested in on their own.

But Mascaro pointed out it’s hard for most to keep up with hyperscalers when it comes to offering the latest and greatest hardware and sheer scale.

“I think actually we are getting really close to a tipping point where cloud in general and public cloud in particular are going to become much, much more core to what science does and how science works because the advantages of scaling out computation and the need to scale out computation is becoming so predominant in the scientific world,” he concluded.