Ookami is a computer technology testbed supported by the National Science Foundation under grant OAC 1927880. It provides researchers with access to the A64FX processor, which was developed by RIKEN and Fujitsu for the Japanese path to exascale computing and is deployed in Fugaku, the fastest computer in the world until June 2022. Ookami is the first such computer outside of Japan. By focusing on crucial architectural details, the ARM-based, multi-core, 512-bit SIMD-vector processor with ultrahigh-bandwidth memory promises to retain familiar and successful programming models while achieving very high performance for a wide range of applications. It supports a wide range of data types and enables both HPC and big data applications.

The Ookami HPE (formerly Cray) Apollo 80 system has 176 A64FX compute nodes, as well as 2 dedicated debug nodes, each with 32 Gbyte of high-bandwidth memory and a 512 Gbyte SSD. This amounts to about 1.5M node hours per year. A high-performance Lustre filesystem provides about 800 TB of storage.
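
The "about 1.5M node hours per year" figure can be checked with simple arithmetic. The sketch below is only an illustration: it assumes that all 176 compute nodes are available around the clock for a 365-day year, with no maintenance downtime, and that the 2 debug nodes are not counted. The function and constant names are made up for this example.

```python
# Rough check of the "about 1.5M node hours per year" figure.
# Assumptions (not stated in the text): every compute node is available
# 24 hours a day, 365 days a year, and the debug nodes are excluded.
COMPUTE_NODES = 176
HOURS_PER_YEAR = 24 * 365


def node_hours_per_year(nodes: int, hours: int = HOURS_PER_YEAR) -> int:
    """Return the node hours delivered by `nodes` nodes over `hours` hours."""
    return nodes * hours


total = node_hours_per_year(COMPUTE_NODES)
print(f"{total:,} node hours per year")  # prints "1,541,760 node hours per year"
```

Under these assumptions the result is 1,541,760 node hours, which rounds to the 1.5M quoted above. Real availability will be somewhat lower because of maintenance windows.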

To facilitate users exploring current computer technologies and contrasting performance and programmability with the A64FX, Ookami also includes:

  • 1 node with dual socket AMD Milan (64 cores) with 512 Gbyte memory 
  • 2 nodes with dual socket Thunder X2 (64 cores) each with 256 Gbyte memory
  • 1 node with dual socket Intel Skylake (36 cores) with 192 Gbyte memory 

 

Allocations

Since October 2022, Ookami has been an ACCESS service provider. We strongly encourage researchers interested in Ookami to submit an ACCESS allocation request. ACCESS allocation

For the moment allocations via Stony Brook are also still accepted - Request A Project On Ookami via Stony Brook allocation
But we strongly encourage researchers to request allocations via ACCESS.

Frequently Asked Questions   Submit A Ticket

Contact: ookami_computer@stonybrook.edu

Important notice:
Since October 2022, Ookami has been an ACCESS resource provider.
Existing testbed projects still have access, though at reduced priority.

 

 

Announcements

Ongoing Events:

Ookami office hours are taking place twice a week:

  • Tuesdays 10 am - noon EST
  • Thursdays 2 pm - 4 pm EST

Ask questions, get in contact with other users and see what they are working on. Everybody is welcome to join! If you don't have the invite, please contact us.

Upcoming Events

April 4, 2024:

3rd Ookami User Group Meeting

The purpose of this meeting is to share experiences, learn from each other, get feedback and have a fruitful discussion about A64FX and Ookami.
The meeting will take place from 2 - 4 pm EST via Zoom and will feature several 5 to 10 minute talks from users. If you are interested in presenting, please indicate so in the registration form.
Register

Have a look at our previous user group meetings (2022, 2023).

Past Events

March 7, 2024:

Ookami Open OnDemand Webinar

Ookami is now available via Open OnDemand, an intuitive, innovative, and interactive interface to remote computing resources. Open OnDemand helps computational researchers and students efficiently utilize remote computing resources by making them easy to access from any device. In this webinar we will show you how to use Open OnDemand on Ookami.

Slides
Recording will be available soon

February 29, 2024:

Ookami Webinar

Whether you are interested in Ookami and considering getting an account, a new user, or a longtime user who wants to optimize their usage, this webinar is for you!
It will cover the basics of the system, a lot of tips and tricks on how to use it efficiently, and how to get an account.
Slides

November 12-17, 2023: Supercomputing 2023

July 18th & August 8th, 2023:

Webinar about Linaro Forge

In this presentation we will provide an overview of Linaro Forge, a cross platform, integrated environment for debugging and optimizing parallel codes at any scale. We will provide hands-on demonstrations of how Linaro Forge reduces development time, simplifies debugging, and eases application performance enhancement.

Ensuring Program Correctness with Linaro DDT
Using sample codes, we will walk through the major capabilities of the debugger to illustrate how DDT can debug applications ranging from a single thread to large scale.

  • Using Forge as a remote client
  • Using semantic analysis tools to catch bugs before you even run the code
  • How to use sparklines to visualize variable values across processes and threads
  • Illustrate memory debugging to trap array out of bounds errors and memory leaks
  • Using the array viewer to visualize multi-dimensional variables
  • Using watchpoints to halt execution dependent upon expression values
  • Offline debugging for large scale debugging, catching non-deterministic errors and continuous integration
  • Trace points, a flexible and deterministic printf alternative
  • Modifying the definition of your program without re-building using the Evaluate Window
  • Debugging on a GPU (optional)
  • Debugging Python (optional)

Performance Engineering with Linaro Performance Reports and Linaro MAP
We will illustrate how in a matter of minutes you can understand the nature of your application’s performance. We will introduce best practices to attain and maintain optimal performance.

  • Understanding the performance road map
  • Characterize file IO behavior
  • Isolate workload imbalance issues in codes at any scale
  • See how the amount of time spent in memory operations varies over time and processes
  • Determine how well your application is vectorized
  • Profiling Python applications (optional)

Speakers: Beau Paisley and Rudy Shand, Linaro

Recordings here

July 23-27th, 2023:

We will attend the PEARC'23 conference in Portland, Oregon. Meet us at:

  • Workshop: The Taming of the Wolf - or how to use the Ookami Cray Apollo 80 system and its Fujitsu A64FX processors; Mo July 24th, 8.30 am - 4.30 pm PDT (slides)
  • Presentation: From Molecular Dynamics to Oceanography - Ookami Graduate Students Porting and Tuning Science Codes for A64FX
  • Presentation: A Further Study of Linux Kernel Hugepages on A64FX with FLASH, an Astrophysical Simulation Code

May 7th -11th, 2023:

The Cray User Group meeting (CUG) will take place in Helsinki, Finland. Ookami is a Cray Apollo 80 system and we are looking forward to sharing our experiences in the following talks:

  • Su, May 7th: Programming Environments, Applications, and Documentation session
  • Tue, May 9th, 2 - 2.30 pm EEST: Nikolay Simakov, Benchmarking High-End ARM Systems with Scientific Applications. Performance and Energy Efficiency
  • Th, May 11th, 11.30 -2 pm EEST: Eva Siegmann, The Ookami Apollo80 system: Progress, Challenges and Next Steps
  • Th, May 11th, 2.45 - 3.15 pm EEST: Smeet Chheda, Performance Study on CPU-based Machine Learning with PyTorch 

April 26th, 2023:

NSF technical talk: Ookami - Experiences during the First Three Years of a Computing Technology Testbed by Eva Siegmann

March 23rd, 2023:

The 2nd Ookami user group meeting took place on Thursday, March 23rd, at 2 pm EST.
The purpose of this meeting was to share experiences, learn from each other, get feedback and have a fruitful discussion about A64FX and Ookami.

December 8th, 2022:

We are holding a webinar about using the Julia programming language on A64FX. Mose Giordano and Valentin Churavy, both Julia core developers, will hold this webinar, consisting of an introduction to the programming language, hands-on examples on Ookami, and lots of opportunities to ask your questions. The webinar is virtual and will take place from 2-4pm EST.

Recording
Examples

October 26th, 2022:

We are holding a webinar targeting researchers interested in using Ookami. We will cover Ookami in general, A64FX, as well as how to get an ACCESS allocation. The webinar is virtual and will take place at 2pm EST.

October 1st, 2022:

From today on, Ookami is officially an ACCESS resource provider with 90% of its resources available via ACCESS.

July 12th, 2022:

Join us for the BoF session "NSF innovative computing technology testbed community exchange" at the PEARC 2022 conference, 1:30pm - 2:20pm EST in Studio 2.

Read the report

Also visit our project partners from the University at Buffalo and hear about XDMoD and ACCESS (where Ookami will be a resource provider):

  • 2022-07-11: 0830-1700: Arlington: "Open OnDemand, Open XDMoD, and ColdFront - an HPC center management toolset"
  • 2022-07-12: 1030-1100: Arlington: "Performance Optimization of the Open XDMoD Datawarehouse - best Full paper"
  • 2022-07-12: 1130-1145: Studio 2: "Developing Accurate Slurm Simulator"
  • 2022-07-12: 1200-1330: The Square: "Meeting ACCESS: an opportunity to discuss the new NSF Cyberinfrastructure with the ACCESS PIs"
  • 2022-07-12: 1330-1430: Studio 1: "Open OnDemand User Group Meeting" (BoF)
  • 2022-07-13: 1330-1430: Studio 1: "XDMoD" (BoF)
  • 2022-07-14: 0900-1000: Arlington: "ColdFront HPC Resource Allocation Management System User Group Meeting" (BoF)
  • 2022-07-14: 1000-1230: Ballroom A: "ACCESSing the Future of Research Computing: A Panel Discussion with Principals of the National Science"

June 15th, 2022:

At 2pm EST we will host a general webinar about the Ookami cluster. This will provide an overview of the system, its characteristics and features, as well as insights on what applications benefit from the A64FX architecture. The webinar is aimed especially at XSEDE users who are considering submitting an allocation proposal in the upcoming submission period (June 15th - July 15th) for allocations starting in October.

May 19th, 2022:

James Custer from HPE will hold a webinar about the Cray Performance Tools and how to use them on Ookami. The webinar will take place from 2pm - 4pm EST with enough time for questions.
The Cray Performance Measurement and Analysis Tools (or CrayPat) are a suite of optional utilities that enable the user to capture and analyze performance data generated during the execution of a program on a Cray system. The information collected and analysis produced by use of these tools can help the user to find answers to two fundamental programming questions: How fast is my program running? and How can I make it run faster?

Recording

May 17th, 2022:

On Tuesday May 17th, starting at 9:00 AM EDT, we will be performing the second and final phase of security and reliability updates. That maintenance is expected to be completed by the end of business the same day. During that outage, the Ookami login nodes and queues will not be available, and jobs will need to be restarted after the maintenance window.

[UPDATE] Ookami is back online and accessible.

May 3rd, 2022:

We will be performing security and reliability updates on Ookami starting 9:00 AM EDT on Tuesday May 3rd. The maintenance is expected to be completed by the end of business the same day. During the outage, the Ookami login nodes and queues will not be available, and jobs will need to be restarted after the maintenance window.

We apologize for the inconvenience and thank you for your patience while we perform these necessary updates.

[UPDATE] Ookami is back online and accessible.

March 24th, 2022:

A general Ookami webinar for everybody interested in this resource took place. It covered the most important topics from technical details about A64FX to the allocation process and the XSEDE integration.

February 10th, 2022:

From 2pm - 5pm EST the Ookami user group meeting took place. Read more

January 13th, 2022:

From 2pm - 4pm EST a webinar about Chapel took place. The recording and slides can be found here.

Tired of mixing multiple programming notations and models to make use of HPC systems?  Come hear about Chapel, a productive, unified language for scalable parallel programming that is being used for scalable applications as diverse as Computational Fluid Dynamics and interactive Data Science.

November 15th-19th, 2021:

The SC conference took place.
There were three talks about the work done on Ookami:

  • Su November 14, 9.28 - 9.44am CST, HPCSysPros Workshop: Lightning Talk: Ookami – The First Year of a Computing Technology Testbed
  • Su November 14, 11- 11.15am CST, EduHPC Workshop: Educating HPC Users in the use of advanced computing technology
  • We November 17, 5.50 - 5.58pm CST, Arm HPC user group BoF session: One year of operations - experiences on the Ookami A64FX testbed

November 11th-12th, 2021:

SC 2021 - ARM HPC user group hackathon. Ookami provides guest accounts for participants.

October 27th, 2021

The thirteenth annual concurrent collections workshop will take place. There will be a talk about Ookami:

We, 27 October 2.45 - 3.15pm "Introduction to Ookami supercomputer"

October 7th, 2021

We will be performing security and reliability updates on Ookami starting 9:00 AM EST on Thursday October 7th. The maintenance is expected to be completed by the end of business the same day. During the outage, the Ookami login nodes and queues will not be available, and running jobs will need to be restarted after the maintenance window.

We apologize for the inconvenience and thank you for your patience while we perform these necessary updates.

[UPDATE] Ookami is back online and accessible.

July 27th, 2021

We had a webinar from HPC@FAU. It covered their tools likwid, OSACA, and the roofline-model for SpMV.

July 22nd, 2021

Join panelist and IACS Director Robert Harrison on July 22 at 9:00 PDT / 12:00 EDT / 17:00 BST, for The Next Platform: The Future of Supercomputing is Happening Now, taking place on The Next Platform’s homepage.

July 13th, 2021

We had a webinar about the Parallelware Analyzer by Appentra. Watch the recording here.

June 4th, 2021

Ookami is currently not accessible. We are investigating the issue and will keep you posted.

[UPDATE] Ookami is back online and accessible.

May 25th, 2021

Login1 will be rebooted at 1pm. Please log off by 12:45pm.

[UPDATE] Login1 is running and Ookami is accessible again.

April 28th, 2021

At 1:30pm EST we had an introduction to Ookami. This webinar was mainly for people interested in the system and thinking about getting accounts on Ookami.
Find the slides and recordings in our documentation section.

April 20th, 2021

We had a workshop on TAU from 10am - noon, taking place within the weekly hackathon.
TAU Performance System® is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, Python.
Find the recordings in our documentation section.

March 23rd, 2021

We had an introduction to XDMoD from 10 - 10.30 am. It took place within the weekly hackathon.
Recordings are available here.

March 3rd, 2021

The Ookami webinar will give an introduction to the system and demonstrate its usage. Everybody interested in Ookami can join via Zoom.
03/03/2021: 2 pm - 3 pm EST
Find the slides here

February 24th - 26th, 2021

We are delighted to announce that there will be a hands-on training session with Arm. The topics will cover the toolchains from Arm, Fujitsu, Cray and open source, tuning for A64FX, and many more. This will enable you to get the most out of your computations.
The sessions will be on
02/24/2021: 1 pm – 3 pm EST
02/25/2021: 1 pm – 3 pm EST
02/26/2021: 1 pm – 3 pm EST
You can find the material of the hackathon here.

January 26th, 2021:

We are happy to announce that Ookami had a successful review of the 1st project year. The system is accepted and all vendors are paid. Join us in exploring the system's capabilities and get an account.

January 25th, 2021:

We will be performing security and reliability updates on Ookami starting at 9:00 AM EST on Thursday January 28th. The maintenance is expected to be completed by noon the same day. During the outage, the Ookami login nodes and queues will not be available, and running jobs will need to be restarted after the maintenance window.

We apologize for the inconvenience and thank you for your patience while we perform these necessary updates.

January 4th, 2021:

The Ookami lead research scientist Dr. Eva Siegmann joined our team. Eva holds a Ph.D. in applied mathematics. She is a specialist in scientific computing with a strong background in particle simulations. Since 2012 Eva has been working in academia and on industrial projects with leading companies and research groups. Before coming to SBU she was located in Austria.
Eva is looking forward to supporting the Ookami users and having fun with the system.

December 28th, 2020:

The Ookami cluster is currently inaccessible. We are working on a resolution and will provide another update once the issue is resolved.

[UPDATE] The issue has been resolved and Ookami is now once again accessible.

In addition, we were just informed of scheduled electrical maintenance at the CEWIT Data Center on the circuits powering the Ookami cluster. In anticipation of this necessary maintenance, the Ookami cluster will be going down on Tuesday 12/29/2020 at the end of business and coming back online by noon on Wednesday 12/30/2020. We thank you for your patience while these important updates are completed.

December 21st, 2020:

Due to network issues, the Ookami cluster is currently inaccessible. We are working with the networking teams on a resolution and will provide another update once the issue is resolved.

[UPDATE] Network issues now resolved.

November 19th, 2020: 

In order to upgrade our storage system, we will have scheduled maintenance on the Ookami cluster starting at 10 AM on Monday November 23rd. The maintenance is expected to be completed by the start of business on Tuesday November 24th.

During this outage, all Ookami Login nodes and queues will be unavailable. We apologize for the inconvenience and thank you for your patience while we work through these necessary upgrades.