Ookami is a computer technology testbed supported by the National Science Foundation under grant OAC 1927880. It provides researchers with access to the A64FX processor, developed by RIKEN and Fujitsu for the Japanese path to exascale computing and deployed in Fugaku, which was the fastest computer in the world until June 2022. Ookami is the first such computer outside of Japan. By focusing on crucial architectural details, the ARM-based, multi-core, 512-bit SIMD-vector processor with ultrahigh-bandwidth memory promises to retain familiar and successful programming models while achieving very high performance for a wide range of applications. It supports a wide range of data types and enables both HPC and big data applications.

The Ookami HPE (formerly Cray) Apollo 80 system has 174 A64FX compute nodes, each with 32 Gbyte of high-bandwidth memory and a 512 Gbyte SSD. This amounts to about 1.5M node hours per year. A high-performance Lustre filesystem provides about 800 TB of storage.
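
The node-hour figure can be checked with a little arithmetic. The sketch below is only an illustration: the node count and per-node memory come from the paragraph above, while the assumption that every node is available around the clock for a 365-day year (no maintenance windows) and the function name `node_hours_per_year` are ours.

```python
# Figures taken from the system description above.
A64FX_NODES = 174
HBM_PER_NODE_GBYTE = 32

# Assumption: a 365-day year with every node available 24 hours a day.
HOURS_PER_YEAR = 24 * 365


def node_hours_per_year(nodes: int, hours: int = HOURS_PER_YEAR) -> int:
    """Return the node hours a cluster offers in one year."""
    return nodes * hours


if __name__ == "__main__":
    total = node_hours_per_year(A64FX_NODES)
    print(f"Node hours per year: {total:,} (about {total / 1e6:.1f}M)")
    print(f"Aggregate high-bandwidth memory: {A64FX_NODES * HBM_PER_NODE_GBYTE:,} Gbyte")
```

Under these assumptions the A64FX partition offers 1,524,240 node hours per year, which rounds to the 1.5M quoted above. Real availability will be somewhat lower because of maintenance outages.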

To help users explore current computer technologies and compare their performance and programmability with the A64FX, Ookami also includes:

  • 1 node with dual-socket AMD Milan (64 cores) with 512 Gbyte memory and 2 NVIDIA V100 GPUs
  • 2 nodes with dual-socket ThunderX2 (64 cores), each with 256 Gbyte memory
  • 1 node with dual-socket Intel Skylake (36 cores) with 192 Gbyte memory


 

Important notice:
Starting in October 2022, Ookami will be an XSEDE Level 2 service provider.
Existing testbed projects will still have access, though at reduced priority.

 

Announcements

Ongoing Events:

Ookami office hours take place twice a week:

  • Tuesdays 10 am - noon EST
  • Thursdays 2 pm - 4 pm EST

Ask questions, get in touch with other users, and see what they are working on. Everybody is welcome to join! If you don't have the invite, please contact us.

July 12th, 2022:

Join us for the BoF session "NSF innovative computing technology testbed community exchange" at the PEARC 2022 conference, 1:30pm - 2:20pm EST in Studio 2.

Read the report

Also visit our project partners from the University at Buffalo and hear about XDMoD and ACCESS (where Ookami will be a resource provider):

  • 2022-07-11: 0830-1700: Arlington: "Open OnDemand, Open XDMoD, and ColdFront - an HPC center management toolset"
  • 2022-07-12: 1030-1100: Arlington: "Performance Optimization of the Open XDMoD Datawarehouse" (best full paper)
  • 2022-07-12: 1130-1145: Studio 2: "Developing Accurate Slurm Simulator"
  • 2022-07-12: 1200-1330: The Square: "Meeting ACCESS: an opportunity to discuss the new NSF Cyberinfrastructure with the ACCESS PIs"
  • 2022-07-12: 1330-1430: Studio 1: "Open OnDemand User Group Meeting" (BoF)
  • 2022-07-13: 1330-1430: Studio 1: "XDMoD" (BoF)
  • 2022-07-14: 0900-1000: Arlington: "ColdFront HPC Resource Allocation Management System User Group Meeting" (BoF)
  • 2022-07-14: 1000-1230: Ballroom A: "ACCESSing the Future of Research Computing: A Panel Discussion with Principals of the National Science"

June 15th, 2022:

At 2pm EST we will host a general webinar about the Ookami cluster. This will provide an overview of the system, its characteristics and features, as well as insights into which applications benefit from the A64FX architecture. The webinar is aimed especially at XSEDE users who are considering submitting an allocation proposal in the upcoming submission period (June 15th - July 15th) for allocations starting in October.

May 19th, 2022:

James Custer from HPE will hold a webinar about the Cray Performance Tools and how to use them on Ookami. The webinar will take place from 2pm - 4pm EST with enough time for questions.
The Cray Performance Measurement and Analysis Tools (or CrayPat) are a suite of optional utilities that enable the user to capture and analyze performance data generated during the execution of a program on a Cray system. The information collected and analysis produced by use of these tools can help the user to find answers to two fundamental programming questions: How fast is my program running? and How can I make it run faster?

Recording

May 17th, 2022:

On Tuesday May 17th, starting at 9:00 AM EDT, we will be performing the second and final phase of security and reliability updates. That maintenance is expected to be completed by the end of business the same day. During that outage, the Ookami login nodes and queues will not be available, and jobs will need to be restarted after the maintenance window.

[UPDATE] Ookami is back online and accessible.

May 3rd, 2022:

We will be performing security and reliability updates on Ookami starting 9:00 AM EDT on Tuesday May 3rd. The maintenance is expected to be completed by the end of business the same day. During the outage, the Ookami login nodes and queues will not be available, and jobs will need to be restarted after the maintenance window.

We apologize for the inconvenience and thank you for your patience while we perform these necessary updates.

[UPDATE] Ookami is back online and accessible.

March 24th, 2022:

A general Ookami webinar for everybody interested in this resource took place. It covered the most important topics from technical details about A64FX to the allocation process and the XSEDE integration.

February 10th, 2022:

From 2pm - 5pm EST the Ookami user group meeting took place. Read more

January 13th, 2022:

From 2pm - 4pm EST a webinar about Chapel took place. The recording and slides can be found here.

Tired of mixing multiple programming notations and models to make use of HPC systems? Come hear about Chapel, a productive, unified language for scalable parallel programming that is being used for scalable applications as diverse as Computational Fluid Dynamics and interactive Data Science.

November 15th-19th, 2021:

The SC conference took place.
There were three talks about the work done on Ookami:

  • Su November 14, 9.28 - 9.44am CST, HPCSysPros Workshop: Lightning Talk: Ookami – The First Year of a Computing Technology Testbed
  • Su November 14, 11 - 11.15am CST, EduHPC Workshop: Educating HPC Users in the use of advanced computing technology
  • We November 17, 5.50 - 5.58pm CST, Arm HPC user group BoF session: One year of operations - experiences on the Ookami A64FX testbed

November 11th-12th, 2021:

SC 2021 - ARM HPC user group hackathon. Ookami provides guest accounts for participants.

October 27th, 2021:

The thirteenth annual Concurrent Collections workshop will take place. There will be a talk about Ookami:

We, 27 October 2.45 - 3.15pm "Introduction to Ookami supercomputer"

October 7th, 2021:

We will be performing security and reliability updates on Ookami starting 9:00 AM EST on Thursday October 7th. The maintenance is expected to be completed by the end of business the same day. During the outage, the Ookami login nodes and queues will not be available, and running jobs will need to be restarted after the maintenance window.

We apologize for the inconvenience and thank you for your patience while we perform these necessary updates.

[UPDATE] Ookami is back online and accessible.

July 27th, 2021:

We had a webinar from HPC@FAU. It covered their tools LIKWID and OSACA, and the roofline model for SpMV.
Recordings and slides will be available soon.

July 22nd, 2021:

Join panelist and IACS Director Robert Harrison on July 22 at 9:00 PDT / 12:00 EDT / 17:00 BST, for The Next Platform: The Future of Supercomputing is Happening Now, taking place on The Next Platform’s homepage.

July 13th, 2021:

We had a webinar about the Parallelware Analyzer by Appentra. Watch the recording here.

June 4th, 2021:

Ookami is currently not accessible. We are investigating the issue and will keep you posted.

[UPDATE] Ookami is back online and accessible.

May 25th, 2021:

Login1 will be rebooted at 1pm. Please log off by 12:45pm.

[UPDATE] Login1 is running and Ookami is accessible again.

April 28th, 2021:

At 1:30pm EST we had an introduction to Ookami. This webinar was mainly for people interested in the system and thinking about getting accounts on Ookami.
Find the slides and recordings in our documentation section.

April 20th, 2021:

We had a workshop on TAU from 10am - noon as part of the weekly hackathon.
TAU Performance System® is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, and Python.
Find the recordings in our documentation section.

March 23rd, 2021:

We had an introduction to XDMoD from 10 - 10.30 am as part of the weekly hackathon.
Recordings are available here.

March 3rd, 2021:

The Ookami webinar will give an introduction to the system and demonstrate its usage. Everybody interested in Ookami can join via Zoom.
03/03/2021: 2 pm - 3 pm EST
Find the slides here.

February 24th - 26th, 2021:

We are delighted to announce that there will be a hands-on training session with Arm. The topics will cover the toolchains from Arm, Fujitsu, Cray, and open source, tuning for A64FX, and many more. This will enable you to get the most out of your computations.
The sessions will be on:
02/24/2021: 1 pm – 3 pm EST
02/25/2021: 1 pm – 3 pm EST
02/26/2021: 1 pm – 3 pm EST
You can find the material of the hackathon here.

January 26th, 2021:

We are happy to announce that Ookami had a successful review of its first project year. The system has been accepted and all vendors have been paid. Join us in exploring the system's capabilities and get an account.

January 25th, 2021:

We will be performing security and reliability updates on Ookami starting at 9:00 AM EST on Thursday January 28th. The maintenance is expected to be completed by noon the same day. During the outage, the Ookami login nodes and queues will not be available, and running jobs will need to be restarted after the maintenance window.

We apologize for the inconvenience and thank you for your patience while we perform these necessary updates.

January 4th, 2021:

The Ookami lead research scientist, Dr. Eva Siegmann, joined our team. Eva holds a Ph.D. in applied mathematics. She is a specialist in scientific computing with a strong background in particle simulations. Since 2012 Eva has worked in academia and on industrial projects with leading companies and research groups. Before coming to SBU she was based in Austria.
Eva is looking forward to supporting the Ookami users and having fun with the system.

December 28th, 2020:

The Ookami cluster is currently inaccessible. We are working on a resolution and will provide another update once the issue is resolved.

[UPDATE] The issue has been resolved and Ookami is now once again accessible.

In addition, we were just informed of scheduled electrical maintenance at the CEWIT Data Center on the circuits powering the Ookami cluster. In anticipation of this necessary maintenance, the Ookami cluster will be going down on Tuesday 12/29/2020 at the end of business and coming back online by noon on Wednesday 12/30/2020. We thank you for your patience while these important updates are completed.

December 21st, 2020:

Due to network issues, the Ookami cluster is currently inaccessible. We are working with the networking teams on a resolution and will provide another update once the issue is resolved.

[UPDATE] Network issues now resolved.

November 19th, 2020: 

In order to upgrade our storage system, we will have scheduled maintenance on the Ookami cluster starting at 10 AM on Monday November 23rd. The maintenance is expected to be completed by the start of business on Tuesday November 24th.

During this outage, all Ookami login nodes and queues will be unavailable. We apologize for the inconvenience and thank you for your patience while we work through these necessary upgrades.