# Commands Reference

Use this page when you remember the task but not the exact command. Commands are grouped by workflow and link back to the module where they first appear.
## Getting Around

| I want to… | Command |
|---|---|
| Show the current machine name | `hostname` |
| Show my username | `whoami` |
| Show my current directory | `pwd` |
| Change into a module exercise directory | `cd <exercise-dir>` |
| List files | `ls` |
| List newest output files first | `ls -lt` |
| Print a file | `cat <file>` |
| Watch a job output file update | `tail -f <file>` |
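The listing and printing commands above are often combined to grab the most recent job output. A minimal sketch, with illustrative file names created just for the demonstration:

```shell
# Create two files with distinct timestamps (illustrative names)
touch older.out
sleep 1
touch newer.out

# ls -t sorts newest first; head -n 1 takes the most recent
newest=$(ls -t *.out | head -n 1)
echo "newest file: $newest"
```

On the cluster you would typically `tail -f` the newest file while the job is still writing to it, rather than `cat` it once.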
## Cluster And Filesystem

| I want to… | Command |
|---|---|
| Show home directory path | `echo $HOME` |
| Show work directory path | |
| Check home directory disk space | `df -h ~` |
| Check work directory disk space | `df -h <work-dir>` |
| Show memory usage | `free -h` |
| Show CPU details | `lscpu` |
| Show just the CPU model | `lscpu \| grep "Model name"` |

See Module 1 for the first cluster walkthrough.
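Several of these checks are often run together as a quick node survey. A sketch using only portable commands (exact output formats vary by system):

```shell
# Quick node survey
hostname                    # machine name
echo "$HOME"                # home directory path
df -h "$HOME" | tail -n 1   # filesystem usage for home
uname -m                    # CPU architecture
```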
## Software Modules

| I want to… | Command |
|---|---|
| List loaded modules | `module list` |
| List available modules | `module avail` |
| Inspect the tutorial base module | `module show <tutorial-module>` |

!!! note
    The tutorial environment is preconfigured. Do not run `module purge` unless an instructor tells you to.
## Slurm Jobs

| I want to… | Command |
|---|---|
| Submit a batch job | `sbatch job.sh` |
| See my jobs | `squeue -u $USER` |
| See all jobs on the tutorial partition | `squeue -p <partition>` |
| See partition and node status | `sinfo` |
| See detailed node status | `sinfo -N -l` |
| Get details for one job | `scontrol show job <jobid>` |
| See accounting info after a job finishes | `sacct -j <jobid>` |
| Cancel one job | `scancel <jobid>` |
| Cancel all my jobs | `scancel -u $USER` |

See Module 2 for the Slurm walkthrough.
## Quick Compute-Node Commands

| I want to… | Command |
|---|---|
| Run `hostname` on a compute node | `srun hostname` |
| Check CPU model on a compute node | `srun lscpu \| grep "Model name"` |
| Preview ROCm GPU info on a compute node | `srun rocm-smi` |
| Run a program on one compute node | `srun -N 1 ./program` |
| Run with 4 MPI ranks | `srun -n 4 ./program` |
| Run with 16 CPU cores for one task | `srun -c 16 ./program` |
## Slurm Script Directives

| Directive | Meaning |
|---|---|
| `#SBATCH --job-name=<name>` | Name shown in the Slurm queue |
| `#SBATCH --partition=<partition>` | Use the tutorial partition |
| `#SBATCH --nodes=1` | Request one node |
| `#SBATCH --ntasks=1` | Request one task or MPI rank |
| `#SBATCH --cpus-per-task=<n>` | Request CPU cores for one task |
| `#SBATCH --time=<HH:MM:SS>` | Set the job time limit |
| `#SBATCH --output=<file>` | Write stdout to a file; `%j` in the name expands to the job ID |
| `#SBATCH --error=<file>` | Write stderr to a file |
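The directives above combine into a minimal batch script. A sketch: the job name, partition, and file names are placeholders, not the tutorial's actual values.

```shell
# Write a minimal batch script (placeholders throughout)
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --partition=<partition>
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:05:00
#SBATCH --output=hello-%j.out

echo "Running on $(hostname)"
EOF

# On the cluster you would submit it with: sbatch hello_job.sh
# The #SBATCH lines are comments to bash, so the body also runs locally:
bash hello_job.sh
```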
## Python Environment

| I want to… | Command |
|---|---|
| Create the tutorial virtual environment from the repo root | `python3 -m venv .venv` |
| Create the venv from inside a module exercise directory | `python3 -m venv <repo-root>/.venv` |
| Activate the venv | `source .venv/bin/activate` |
| Verify PyTorch | `python -c "import torch; print(torch.__version__)"` |
| Verify ROCm is visible to PyTorch | `python -c "import torch; print(torch.cuda.is_available())"` |
| Leave the venv | `deactivate` |
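The create/activate/verify/leave cycle can be rehearsed with a throwaway venv. A sketch: the path is illustrative, and `--without-pip` keeps creation fast because only the interpreter is being verified here (the tutorial venv would of course include pip and PyTorch).

```shell
# Create, use, and leave a throwaway venv
python3 -m venv --without-pip /tmp/demo-venv
. /tmp/demo-venv/bin/activate
python -c "import sys; print(sys.prefix)"   # prints the venv path while active
deactivate
```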
## C And OpenMP

| I want to… | Command |
|---|---|
| Compile a plain C program | `gcc program.c -o program` |
| Compile a C program that uses math functions | `gcc program.c -o program -lm` |
| Compile an OpenMP program | `gcc -fopenmp program.c -o program` |
| Set the OpenMP thread count | `export OMP_NUM_THREADS=4` |
| Run OpenMP with 4 threads through Slurm | `srun -c 4 ./program` |

Useful OpenMP snippets from Module 3:

| OpenMP construct | Purpose |
|---|---|
| `#pragma omp parallel` | Start a parallel region |
| `#pragma omp for` | Split loop iterations across threads |
| `#pragma omp parallel for` | Start threads and split a loop in one directive |
| `reduction(+:sum)` | Safely combine per-thread values into one shared result |
| `omp_get_thread_num()` | Get the current thread ID |
| `omp_get_num_threads()` | Get the size of the current thread team |
## MPI

| I want to… | Command |
|---|---|
| Compile an MPI C program | `mpicc program.c -o program` |
| Compile an MPI C program using math functions | `mpicc program.c -o program -lm` |
| Run with 4 ranks | `srun -n 4 ./program` |
| Run with 16 ranks | `srun -n 16 ./program` |

Useful MPI calls from Module 4:

| MPI call | Purpose |
|---|---|
| `MPI_Init` | Initialize MPI |
| `MPI_Finalize` | Shut down MPI |
| `MPI_Comm_rank` | Get this rank's ID |
| `MPI_Comm_size` | Get the number of ranks |
| `MPI_Send` | Send data to another rank |
| `MPI_Recv` | Receive data from another rank |
| `MPI_Reduce` | Combine values across ranks |
| `MPI_Bcast` | Broadcast a value from one rank to all ranks |
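The lifecycle calls above appear in every MPI program in the same order. A minimal sketch, not the module's actual exercise; it compiles and runs only if an MPI toolchain is on the current machine:

```shell
cat > mpi_hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);               /* initialize MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this rank's ID */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* number of ranks */
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                       /* shut down MPI */
    return 0;
}
EOF

if command -v mpicc >/dev/null 2>&1; then
    mpicc mpi_hello.c -o mpi_hello
    mpirun -np 4 ./mpi_hello   # on the cluster: srun -n 4 ./mpi_hello
else
    echo "mpicc not found; wrote mpi_hello.c only"
fi
```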
## HIP And GPU Tools

| I want to… | Command |
|---|---|
| Check whether `hipcc` is available | `which hipcc` |
| Show the HIP compiler version | `hipcc --version` |
| Show ROCm system info | `rocminfo` |
| Show GPU memory usage | `rocm-smi --showmeminfo vram` |
| Compile a HIP program | `hipcc program.cpp -o program` |
| Compile with a block-size macro | `hipcc -D<MACRO>=<value> program.cpp -o program` |
| Run a HIP program on a compute node | `srun ./program` |

Useful HIP calls and syntax from Module 5:

| HIP item | Purpose |
|---|---|
| `hipMalloc` | Allocate memory on the GPU |
| `hipMemcpy` | Copy data between host and GPU |
| `hipFree` | Free GPU memory |
| `hipDeviceSynchronize` | Wait for GPU work to finish |
| `hipGetLastError` | Check for kernel launch errors |
| `__global__` | Declare a GPU kernel |
| `kernel<<<blocks, threads>>>(args)` | Launch a kernel |
| `threadIdx.x` | Thread ID within a block |
| `blockIdx.x` | Block ID within the grid |
| `blockDim.x` | Threads per block |
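The HIP items above fit into one small vector-add program. A sketch, not the module's actual exercise; it compiles and runs only where `hipcc` and a GPU are available:

```shell
cat > vec_add.hip.cpp <<'EOF'
#include <hip/hip_runtime.h>
#include <stdio.h>

__global__ void vec_add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  /* global thread index */
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1024;
    size_t bytes = n * sizeof(float);
    float ha[1024], hb[1024], hc[1024];
    for (int i = 0; i < n; i++) { ha[i] = i; hb[i] = 2.0f * i; }

    float *da, *db, *dc;
    hipMalloc(&da, bytes); hipMalloc(&db, bytes); hipMalloc(&dc, bytes);
    hipMemcpy(da, ha, bytes, hipMemcpyHostToDevice);
    hipMemcpy(db, hb, bytes, hipMemcpyHostToDevice);

    vec_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    if (hipGetLastError() != hipSuccess)   /* check the launch */
        printf("kernel launch failed\n");
    hipDeviceSynchronize();                /* wait for GPU work */
    hipMemcpy(hc, dc, bytes, hipMemcpyDeviceToHost);

    printf("hc[10] = %g\n", hc[10]);       /* 10 + 20 */
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
EOF

if command -v hipcc >/dev/null 2>&1; then
    hipcc vec_add.hip.cpp -o vec_add && ./vec_add   # on the cluster: srun ./vec_add
else
    echo "hipcc not found; wrote vec_add.hip.cpp only"
fi
```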
## AI Exercises

| I want to… | Command |
|---|---|
| Run the inference batch script | `sbatch <inference-script>` |
| Read inference output | `cat <output-file>` |
| Run the fine-tuning batch script | `sbatch <fine-tuning-script>` |
| Read fine-tuning output | `cat <output-file>` |
| Load a text generation model in Python | `AutoModelForCausalLM.from_pretrained(...)` |
| Load a tokenizer in Python | `AutoTokenizer.from_pretrained(...)` |
| Move a PyTorch model to the GPU | `model.to("cuda")` |
| Generate text with a model | `model.generate(...)` |

See Module 6 for inference and fine-tuning details.
## AI Agent

| I want to… | Command |
|---|---|
| Show the shared agent server URL | |
| Export the agent server URL for interactive testing | |
| Run the agent exercise | |
| Launch the tutorial-provided CLI agent | |
| Launch aider from an arbitrary working directory | `aider` |

See Module 7 for the agent exercises.
## Common Output Files

| Job or exercise | Output file pattern |
|---|---|
| Venv setup | |
| Slurm first job | `slurm-<jobid>.out` |
| Compute-node hello | |
| OpenMP pi | |
| MPI sum | |
| HIP exercise | |
| Inference | |
| Fine-tuning | |
| Agent exercise | |