Thursday, January 23, 2025

OpenACC To OpenMP Migration: Harness Intel CPUs And GPUs


Intel CPUs and GPUs: Achieving High Performance Parallelism. Simple Code Migration from OpenACC to OpenMP.

OpenACC (for Open Accelerators) is a parallel programming standard for offloading workloads to GPUs from NVIDIA, AMD, and a few other vendors. Although it supports the C and C++ programming languages, it is mostly used for Fortran applications. It is a lightweight framework that concentrates solely on GPU and accelerator offload parallelism.


Introduced in 1997, OpenMP is a more comprehensive, feature-rich, and adaptable framework for adding parallelism to C/C++ and Fortran workloads. It started out as an open, multi-architecture standard for parallel programming on CPUs, so it is not limited to GPU offload. A significant benefit that OpenMP offers developers, and OpenACC does not, is therefore the ability to use the capabilities of the newest Intel hardware, including CPUs, GPUs, and other accelerators. Since version 4.0 was released in 2013, OpenMP has steadily expanded its offload capabilities for GPUs and other accelerators, and it continues to evolve quickly.

Although OpenMP is a somewhat heavier paradigm than OpenACC, this is greatly outweighed by its open, feature-rich, standards-based, and highly configurable approach to code parallelisation. It is widely used across the high-performance computing developer ecosystem, and it lets you benefit from Intel’s most recent GPU lineup, integrated accelerators, and architectural innovations.

In this article, we go over three code samples that show how to convert C/C++ code based on OpenACC directives to OpenMP, using the Intel oneAPI Base Toolkit and the Intel Application Migration Tool for OpenACC to OpenMP API, so that you can make full use of OpenMP’s parallel programming capabilities.

An Overview of the Intel Application Migration Tool for OpenACC to OpenMP

The Intel Application Migration Tool automatically converts OpenACC constructs in C/C++ and Fortran code to the corresponding OpenMP constructs. It is a Python 3 utility that migrates code that uses offloading. The tool is part of an ongoing effort and tracks the continually shifting differences between the most recent OpenACC and OpenMP standards. It is designed not to depend on the infrastructure of any one compiler.


Because the tool is compiler-independent, the translated code might not run optimally on a particular piece of hardware, but the tool guarantees a semantically equivalent translation of OpenACC constructs to OpenMP. To tune the application’s speed and efficiency for the hardware platform you are targeting, you may want to analyse the ported code with performance analysis and debugging tools such as Intel VTune Profiler and then adjust the output manually.

However, this is a small price to pay for a tool such as the Intel Application Migration Tool for OpenACC to OpenMP, which can be used anywhere.

Concerning the Code Samples

In this post, we go over three code samples that demonstrate how to use the Intel Application Migration Tool to migrate OpenACC constructs in C/C++ code to OpenMP. For each input file in a sample, the migration tool creates two output files:

  • A .translated file, in which the OpenACC directives are replaced with OpenMP ones.
  • A .report file, which explains which OpenACC clauses were converted to which OpenMP clauses, which OpenACC constructs could not be migrated, and where each migrated or unmigrated construct is located in the code.

The report also suggests comparable OpenMP constructs for the unmigrated OpenACC constructs, which you can implement manually.

The OpenMP runtime library ships with both the Intel Fortran Compiler and the Intel oneAPI DPC++/C++ Compiler. Because the three code samples covered here are written in C, they use the Intel oneAPI DPC++/C++ Compiler included in the Intel oneAPI Base Toolkit. The hardware used for GPU offload and parallel execution of the samples is Intel Core processors with integrated Intel Graphics and the Intel Data Center GPU Max Series. These and other workloads, however, can readily be moved to other Intel architectures of your choosing.

Atomic Sample

OpenMP supports atomic operations, which let multiple threads safely change a shared numerical variable. An atomic directive applies only to the single statement that immediately follows it and makes the access to the shared variable indivisible, so concurrent reads and writes from different threads cannot leave the variable with an indeterminate value. Atomic operations therefore enable fine-grained synchronisation.

The Atomic Sample shows how to migrate the atomic read, write, update, and capture clauses, as well as the parallel loop, from OpenACC to OpenMP so that the source code can be offloaded to Intel GPUs. It also covers installing the Intel Application Migration Tool and using it to migrate the OpenACC directives to OpenMP.

For example, the OpenACC directive on the first line is migrated to the OpenMP directive on the second line:

#pragma acc atomic read
#pragma omp atomic read
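
To illustrate what the four atomic forms look like after migration, here is a minimal sketch in C. It is not taken from the Intel sample: the function names `atomic_sum` and `atomic_slots` are hypothetical, and the loops run on host CPU threads rather than being offloaded, to keep the example short. In an OpenACC source, the same statements would sit under `#pragma acc atomic update`, `capture`, `write`, and `read`.

```c
/* Sums data[0..n) into one shared accumulator using an atomic update.
   (A reduction would usually be faster; this only shows the directive.) */
long atomic_sum(const int *data, int n)
{
    long total = 0;                 /* shared by all threads */
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        #pragma omp atomic update
        total += data[i];
    }
    return total;
}

/* Hands out unique slot numbers with an atomic capture, sets a shared
   flag with an atomic write, and reads it back with an atomic read.
   Returns the number of slots claimed, or -1 if the loop never ran. */
int atomic_slots(int *slots, int n)
{
    int next = 0;                   /* shared ticket counter */
    int done = 0;                   /* shared flag */
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        int mine;                   /* private to each iteration */
        #pragma omp atomic capture
        mine = next++;
        slots[mine] = i;            /* tickets are unique, so no race here */
        #pragma omp atomic write
        done = 1;
    }
    int seen;
    #pragma omp atomic read
    seen = done;
    return seen ? next : -1;
}
```

Each directive covers only the single statement that follows it. If the code is built without OpenMP enabled, the pragmas are ignored and the functions run serially with the same results.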

Monte Carlo GPU Sample

Monte Carlo simulation is a statistical technique that uses random sampling to predict the possible outcomes of an uncertain event in complex mathematical and physical systems. After modelling a system or problem, it runs the simulation many times to determine the range of possible outcomes. The approach is widely used for risk forecasting and mitigation in applications such as cost estimation, investment planning, project management, risk assessment, emergency response planning, and portfolio management.

The Monte Carlo GPU Sample uses the Monte Carlo technique to estimate the price of a European call option and the confidence interval around the expected value. A confidence interval is the range of values within which the estimate, based on many simulation runs, can be expected to fall with a stated level of confidence.

The Intel Application Migration Tool converts most of the OpenACC directives in the sample to OpenMP. Some OpenACC API calls are not yet migrated and must be adjusted manually, so that you can choose the best translation for your use case. The .report output file lists the untranslated OpenACC API calls together with their closest matching OpenMP calls.

Bilateral Filter Sample

Bilateral filtering is a non-linear image-smoothing technique that reduces noise while preserving edges. Most texture, noise, and fine detail is removed, while large, sharp edges are kept without being blurred. It accomplishes this by replacing each pixel’s intensity with a weighted average of the intensity values of neighbouring pixels.

The following three factors determine the outcome of bilateral filtering:

  • Gaussian delta: a function that assigns greater weight to pixels with similar intensities and close spatial proximity. Larger Gaussian deltas blur the fine texture.
  • Euclidean delta: the Euclidean distance between two pixels’ spatial positions. A larger Euclidean delta removes fine texture while preserving the original image’s sharp outlines.
  • Iterations: multiple iterations substantially flatten the colours without blurring edges.

The Bilateral Filter sample shows how to convert OpenACC-based source code for a bilateral filter implementation to OpenMP on Intel architectures. Because the OpenACC kernels construct has no direct OpenMP equivalent, the migration tool translates it to the OpenMP target construct.

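For example, the OpenACC loop directive on the first line is migrated to the OpenMP directive on the second line: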

#pragma acc loop independent, gang
#pragma omp loop order(concurrent)
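
For context, here is a hand-written sketch of an independent loop offloaded through an OpenMP target region. It is not output from the migration tool: it uses the long-established combined `target teams distribute parallel for` directive rather than the `loop order(concurrent)` form shown above, and the function name `scale_values` and the `map` clauses are assumptions made for illustration.

```c
/* Multiplies every element of in[0..n) by factor and stores it in out.
 *
 * A possible OpenACC original, shown for comparison only:
 *     #pragma acc kernels
 *     #pragma acc loop independent, gang
 *     for (int i = 0; i < n; ++i) out[i] = factor * in[i];
 */
void scale_values(const float *in, float *out, int n, float factor)
{
    /* Copy the input to the device, run the iterations in parallel
       there, and copy the output back. Without an available device,
       the OpenMP runtime falls back to running the loop on the host. */
    #pragma omp target teams distribute parallel for \
        map(to: in[0:n]) map(from: out[0:n])
    for (int i = 0; i < n; ++i) {
        out[i] = factor * in[i];
    }
}
```

The iterations do not depend on one another, which is the property that `independent` asserts in OpenACC and that `order(concurrent)` asserts in OpenMP.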

What Comes Next?

Explore the OpenACC to OpenMP code samples in more detail to learn how to use the Intel Application Migration Tool to move your OpenACC code to OpenMP with the fewest possible code changes. To take advantage of high-performance parallelism across Intel CPUs and GPUs, start with the Intel oneAPI DPC++/C++ Compiler and the Intel Fortran Compiler, which build your code efficiently.

We also recommend that you look at the other HPC and AI tools in Intel’s oneAPI-powered software portfolio for accelerated, multiarchitecture, cross-vendor parallel computing.

Drakshi
Since June 2023, Drakshi has been writing articles on Artificial Intelligence for govindhtech. She is a postgraduate in business administration and an Artificial Intelligence enthusiast.