Run OpenMP on Windows

Instructions on how to run MPI, OpenMP and CUDA programs

Sachin Kumawat and Norm Matloff

This is a quick overview of running parallel applications with MPI, OpenMP and CUDA. The recommended platform is Unix (which includes Linux and Mac OS X), but useful (though untested!) links are provided for Windows tools as well. Follow these instructions to gain familiarity with the tools before starting the actual assignments and taking quizzes.

CRUCIAL NOTE: When you take the quizzes, the various executables (gcc, mpicc, mpiexec, R and python) must be in your search path, as the OMSI script will invoke them. A similar statement holds for library paths. Thus it is absolutely essential that you do a dry run of OMSI before the first quiz.

Both MPI (mpicc, mpiexec) and CUDA (nvcc) toolchains are installed on CSIF machines.

Your Laptop

For our quizzes, you will need gcc, MPI and R for running code, and Python for running our OMSI quiz tool. CUDA for quizzes will just be "pencil and paper" style, with no actual compiling or running.

As noted, your version of gcc must be OpenMP-capable. To test that, download omp_hello.c, compile it with gcc -fopenmp omp_hello.c, and run the resulting executable (a.out by default).

This will probably fail on a Mac; see below for the remedy.

For the programming assignments, you will also need gcc, MPI and R.

If you have a CUDA-compatible video card, you may install CUDA, but be prepared to resolve some obstacles. Installation can be performed by following the instructions on the CUDA Toolkit's homepage. The setup is rather involved, but the majority of the issues are discussed here.

Installation of OpenMP capable C/C++ compiler and MPI tools

Details

Unix-family systems

Windows

On Windows, a version of MPICH called MSMPI can be used along with Visual Studio. Download it directly from Microsoft's website. Compilation and launch instructions are provided here.

Set Up Remote Authentication:

MPI implementations work by invoking programs on other nodes via ssh or an equivalent daemon. Therefore, before you can run MPI programs, you must set up passwordless login once from one MPI machine to another. To set up passwordless login on CSIF (or any) systems, check FAQ 7.9 and 7.10 of csif-general-faq.

Compiling MPICH2 Program:

To compile an MPI program written in C, invoke mpicc much as you would gcc, naming the output executable with -o.

For example, for a program PrimePipe.c, make an executable prp this way: mpicc -o prp PrimePipe.c

(You may need to specify the full path to prp.)

(If you wish to use C++, use mpicxx instead of mpicc.)

Running MPICH2 application:

Set up a hosts file, listing which machines you wish your MPI app to run on, e.g. hosts3:

To run the above executable prp on the machines listed in the above hosts file, type a command of the form mpiexec -f hosts3 -n <number of processes> prp 100 0

where 100 and 0 are the command-line arguments to prp.

The GNU gcc/g++ compilers can compile programs with OpenMP directives right out of the box. Therefore no installation or configuration is required on Linux systems (OS X is an exception; see below). To enable OpenMP support for a program hello_openmp.c, simply compile with the flag -fopenmp, e.g. gcc -fopenmp hello_openmp.c -o hello_openmp
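The file hello_openmp.c is not reproduced in these instructions, so the following is only a minimal sketch of what such a test program might contain. The function name count_hello_threads is hypothetical. The sketch relies on a pragma alone and calls no OpenMP library functions, so it also builds (and runs serially) if -fopenmp is left out, which makes the effect of the flag easy to see.

```c
#include <stdio.h>

/* Hypothetical stand-in for the body of hello_openmp.c.
 * Every thread in the parallel region prints a greeting and adds 1 to
 * its private copy of count; the reduction sums the copies at the end.
 * Built with -fopenmp, the result is the number of threads that ran.
 * Built without it, the pragma is ignored and the result is 1. */
int count_hello_threads(void)
{
    int count = 0;

    #pragma omp parallel reduction(+:count)
    {
        printf("Hello from an OpenMP thread\n");
        count += 1;
    }
    return count;
}
```

Calling count_hello_threads() from main and printing the result should show a value greater than 1 on a multicore machine when the program was built with -fopenmp.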

OpenMP for Windows:

Visual Studio support for OpenMP is outdated, hence it is recommended to utilize GCC functionality on Windows by installing either Cygwin or MinGW. For Visual Studio, instructions to enable OpenMP support are provided here.

OpenMP for Mac OS X:

The default clang compiler on OS X does not support OpenMP. Since gcc on OS X is just a symbolic link to clang, using the default gcc/g++ will not work either. We need to install the latest Homebrew version of gcc (e.g. v6.x, via brew install gcc) and add its location to the PATH environment variable. The OpenMP program can then be compiled with the Homebrew compiler, e.g. gcc-6 -fopenmp hello_openmp.c

Note that you will need to alias gcc to gcc-6.

The install took 85 minutes when I tried it. Note that it will install in /usr/local/Cellar. I also found that I needed to make sure that /usr/bin/as is ahead of /opt/local/bin in PATH.

(Use g++/g++-6 for C++ applications.)

To run an OpenMP application, first specify the number of threads using the OMP_NUM_THREADS environment variable. For example, to launch 8 threads, type setenv OMP_NUM_THREADS 8 under tcsh, or export OMP_NUM_THREADS=8 under bash. If OMP_NUM_THREADS is not set, by default as many threads as there are available cores are launched. Then simply run the executable as usual.

CUDA is installed on CSIF systems at /usr/local/cuda-8.0, and you can obtain details about the GPU card installed on a particular system by typing nvidia-smi in a terminal. CUDA programs are compiled by invoking the nvcc compiler. It links with all CUDA libraries and also calls gcc to link with the C/C++ runtime libraries. A CUDA program hello_cuda.cu, which contains both host and device code, can be compiled with nvcc hello_cuda.cu -o hello_cuda and then run as ./hello_cuda

CUDA for Windows:

Visual Studio provides support to directly compile and run CUDA applications. Instructions for installation and sample program execution can be found here.

/openmp (Enable OpenMP Support)

Causes the compiler to process #pragma omp directives in support of OpenMP.

Syntax

/openmp
/openmp:experimental
/openmp:llvm

Remarks

#pragma omp is used to specify Directives and Clauses. If /openmp isn’t specified in a compilation, the compiler ignores OpenMP clauses and directives. OpenMP Function calls are processed by the compiler even if /openmp isn’t specified.
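Because calls to OpenMP functions are still compiled when OpenMP support is off, code that should build both ways is often guarded with the _OPENMP macro, which the OpenMP specification says is defined whenever OpenMP compilation is enabled. The sketch below illustrates that pattern. The function names max_threads_or_one and openmp_version are hypothetical, chosen only for this example.

```c
#ifdef _OPENMP
#include <omp.h>   /* only included when OpenMP compilation is enabled */
#endif

/* Returns the maximum number of threads a parallel region may use,
 * or 1 when the program was compiled without OpenMP support. */
int max_threads_or_one(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Returns the value of _OPENMP (a yyyymm date that identifies the
 * supported OpenMP specification), or 0 when OpenMP is not enabled. */
long openmp_version(void)
{
#ifdef _OPENMP
    return (long)_OPENMP;
#else
    return 0L;
#endif
}
```

With this guard, the serial build has no dependency on the OpenMP header or runtime library, while the OpenMP build reports the real values.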

The C++ compiler currently supports the OpenMP 2.0 standard. However, Visual Studio 2019 also now offers SIMD functionality. To use SIMD, compile by using the /openmp:experimental option. This option enables both the usual OpenMP features, and OpenMP SIMD features not available when using the /openmp switch.

Starting in Visual Studio 2019 version 16.9, you can use the experimental /openmp:llvm option instead of /openmp to target the LLVM OpenMP runtime. Support currently isn’t available for production code, since the required libomp DLLs aren’t redistributable. The option supports the same OpenMP 2.0 directives as /openmp. It also supports all the SIMD directives supported by the /openmp:experimental option, as well as unsigned integer indices in parallel for loops according to the OpenMP 3.0 standard. For more information, see Improved OpenMP Support for C++ in Visual Studio.
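As an illustration of the unsigned-index point, the sketch below uses an unsigned int loop variable in a parallel for loop. OpenMP 2.0 requires a signed integer loop variable, so a loop like this is the kind that needs OpenMP 3.0 behavior. The function name sum_below is hypothetical.

```c
/* Sums 0 + 1 + ... + (n - 1) using a parallel for loop whose index is
 * unsigned. Each thread accumulates a private partial sum, and the
 * reduction clause adds the partial sums together at the end. */
unsigned long long sum_below(unsigned int n)
{
    unsigned long long sum = 0;
    unsigned int i;   /* unsigned index: allowed from OpenMP 3.0 on */

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}
```

If the same loop has to build under an OpenMP 2.0 compiler, declaring the index as a signed int is the usual workaround, provided n fits in that type.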

Currently, the /openmp:llvm option only works on the x64 architecture. The option isn’t compatible with /clr or /ZW.

Applications compiled by using both /openmp and /clr can only be run in a single application domain process. Multiple application domains aren’t supported. That is, when the module constructor (.cctor) is run, it detects if the process is compiled using /openmp, and if the app is loaded into a non-default runtime. For more information, see appdomain, /clr (Common Language Runtime Compilation), and Initialization of Mixed Assemblies.

If you attempt to load an app compiled using both /openmp and /clr into a non-default application domain, a TypeInitializationException exception is thrown outside the debugger, and an OpenMPWithMultipleAppdomainsException exception is thrown in the debugger.

These exceptions can also be raised in the following situations:

If your application is compiled using /clr but not /openmp, and is loaded into a non-default application domain, where the process includes an app compiled using /openmp.

If you pass your /clr app to a utility, such as regasm.exe, which loads its target assemblies into a non-default application domain.

The common language runtime’s code access security doesn’t work in OpenMP regions. If you apply a CLR code access security attribute outside a parallel region, it won’t be in effect in the parallel region.

Microsoft doesn’t recommend that you write /openmp apps that allow partially trusted callers. Don’t use AllowPartiallyTrustedCallersAttribute, or any CLR code access security attributes.

To set this compiler option in the Visual Studio development environment

Open the project’s Property Pages dialog box. For details, see Set C++ compiler and build properties in Visual Studio.

Expand the Configuration Properties > C/C++ > Language property page.

Modify the OpenMP Support property.

To set this compiler option programmatically

Example

The following sample shows some of the effects of thread pool startup versus using the thread pool after it has started. Assuming an x64, single core, dual processor, the thread pool takes about 16 ms to start up. After that, there’s little extra cost for the thread pool.
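The sample code itself did not survive on this page. As a sketch only, and assuming test2 is a simple compute-bound loop, the function below estimates pi with the midpoint rule inside a parallel for loop. The body is an assumption made for illustration, not a verified copy of the original sample.

```c
/* Sketch of a test2-style workload: midpoint-rule estimate of the
 * integral of 4 / (1 + x^2) over [0, 1], which equals pi.
 * num_steps is the number of loop iterations (subintervals). */
double test2(int num_steps)
{
    int i;                       /* signed index, as OpenMP 2.0 requires */
    double x, sum = 0.0;
    double step = 1.0 / (double)num_steps;

    #pragma omp parallel for reduction(+:sum) private(x)
    for (i = 1; i <= num_steps; i++) {
        x = (i - 0.5) * step;    /* midpoint of the i-th subinterval */
        sum += 4.0 / (1.0 + x * x);
    }
    return step * sum;
}
```

To observe the startup effect, call test2 twice with the same argument and time each call separately, for example with omp_get_wtime() when OpenMP is enabled. Any thread pool startup cost should show up only in the first call.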

When you compile using /openmp, the second call to test2 never runs any longer than if you compile using /openmp-, as there’s no thread pool startup. At a million iterations, the /openmp version is faster than the /openmp- version for the second call to test2. At 25 iterations, both /openmp- and /openmp versions register less than the clock granularity.

If you have only one loop in your application and it runs in less than 15 ms (adjusted for the approximate overhead on your machine), /openmp may not be appropriate. If it’s higher, you may want to consider using /openmp.
