This lab covers profiling with NVIDIA® Nsight™ Systems, focusing on the steps to optimize a deep neural network (DNN) training program, written in PyTorch, that recognizes handwritten digits from the Modified National Institute of Standards and Technology (MNIST) dataset. The techniques and strategies discussed in this lab translate to optimizing any application that runs on NVIDIA graphics processing units (GPUs).
This content contains five labs:
- Lab 1: Start the NVIDIA Nsight Systems lab
- Lab 2: PyTorch MNIST and Optimization Workflow
- Lab 3: Data Transfers between Host and GPU
- Lab 4: Tensor Core
- Lab 5: Summary
The duration of the tutorial is 2 hours.
The tools and frameworks used in this bootcamp are as follows
To deploy the Labs, please refer to the deployment guide presented here
This material originates from the OpenHackathons GitHub repository. Check out additional materials here.
Don't forget to check out additional Open Hackathons Resources and join our OpenACC and Hackathons Slack Channel to share your experience and get more help from the community.
Copyright © 2026 OpenACC-Standard.org. This material is released by OpenACC-Standard.org, in collaboration with NVIDIA Corporation, under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. These materials may include references to hardware and software developed by other entities; all applicable licensing and copyrights apply.