Contents

Parallel Processing, Concurrency, and Async Programming in .NET
In This Section
Parallel Programming in .NET
Water Programming: A Collaborative Research Blog
Tips and tricks on programming, evolutionary algorithms, and doing research
Parallel processing with R on Windows
Parallel processing windows OS
Parallel processing using TPL in windows service
2 Answers

Parallel Processing, Concurrency, and Async Programming in .NET

.NET provides several ways for you to write asynchronous code to make your application more responsive to a user and write parallel code that uses multiple threads of execution to maximize the performance of your user’s computer.

In This Section

Asynchronous Programming
Describes mechanisms for asynchronous programming provided by .NET.

Parallel Programming
Describes a task-based programming model that simplifies parallel development, enabling you to write efficient, fine-grained, and scalable parallel code in a natural idiom without having to work directly with threads or the thread pool.

Threading
Describes the basic concurrency and synchronization mechanisms provided by .NET.

Parallel Programming in .NET

Many personal computers and workstations have multiple CPU cores that enable multiple threads to be executed simultaneously. To take advantage of the hardware, you can parallelize your code to distribute work across multiple processors.

In the past, parallelization required low-level manipulation of threads and locks. Visual Studio and .NET enhance support for parallel programming by providing a runtime, class library types, and diagnostic tools. These features, which were introduced in .NET Framework 4, simplify parallel development. You can write efficient, fine-grained, and scalable parallel code in a natural idiom without having to work directly with threads or the thread pool.

The following illustration provides a high-level overview of the parallel programming architecture in .NET.

Water Programming: A Collaborative Research Blog

Tips and tricks on programming, evolutionary algorithms, and doing research

Parallel processing with R on Windows

Parallel programming can save you a lot of time when you are processing large amounts of data. Modern computers provide multiple processors and cores, often with hyper-threading, and R can take advantage of them to run multiple computations simultaneously on the available resources. There is some debate about when to parallelize, because computation time does not decrease linearly with the number of processors and cores used simultaneously. In this blog post, I am going to use two R packages that allow parallelization for a basic example in which each instance of computation is standalone and there is no need for communication between the cores being used in parallel.

If you press Ctrl+Shift+Esc on your keyboard and click on the Performance tab in the Task Manager window, you will see how many logical processors, which are the combination of processors and cores, are available on your local Windows machine and can be used simultaneously for your analysis. We can also detect this number with the following command:
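A minimal sketch of that check, assuming the `parallel` package that ships with base R is one of the two packages in use (the variable names are only for illustration):

```r
library(parallel)  # bundled with base R

# Logical processors visible to R (hyper-threads are counted)
n_logical <- detectCores()

# Physical cores only; this can be NA where it cannot be determined
n_physical <- detectCores(logical = FALSE)

print(n_logical)
print(n_physical)
```

The logical count is normally the one that matches the number shown in Task Manager.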

Now, we need to allocate some of the available cores to R by creating a cluster with that number of workers and then registering the cluster. If you assign all the cores to R, you may have trouble doing anything else on your machine, so it is better not to use all the resources in R.
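One way this step might look, as a sketch only: it assumes the `parallel` package creates the cluster and that `doParallel` is the package used for registration (the registration call is guarded in case it is not installed). Leaving one logical processor free is an assumed policy, not a rule.

```r
library(parallel)

# Leave one logical processor free for the rest of the system (assumed policy)
n_cores <- detectCores()
n_workers <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)

# A socket (PSOCK) cluster, which is the type that works on Windows
cl <- makeCluster(n_workers)

# Registration is only needed for foreach-style loops; doParallel is an
# assumption here, so the call is skipped when the package is missing
if (requireNamespace("doParallel", quietly = TRUE)) {
  doParallel::registerDoParallel(cl)
}

n_started <- length(cl)  # number of worker processes actually started

# Stopped here only to keep the sketch self-contained; in a real session
# the cluster stays open until the analysis has finished
stopCluster(cl)
```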

First, we are going to create a list of files that we want to analyze. You can download the example dataset here.
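A sketch of building that list with base R. The folder name, file names, and the `file` column are hypothetical; a temporary directory with two empty files stands in for the downloaded dataset. Storing the paths in a data frame called `all_samples` follows the name used later in this post.

```r
# Hypothetical folder: a temporary directory stands in for the real dataset
data_dir <- file.path(tempdir(), "example_dataset")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, c("sample_1.txt", "sample_2.txt")))

# One row per file to be processed
all_samples <- data.frame(
  file = list.files(data_dir, pattern = "\\.txt$", full.names = TRUE),
  stringsAsFactors = FALSE
)

print(nrow(all_samples))
```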

Then, we will create a function that we will use for processing our data. Each file in this set has daily values for several variables over 31 years. Columns 1, 2, and 3 are year, month, and day, respectively. I am going to extract the yield value for each year from the column “OUT_CROP_BIOMYELD” and calculate the average yield for the entire period. All the libraries that you are going to use for processing your data should be loaded inside the function. I am going to use the “data.table” library to efficiently read my data into R. At the end of the function, I put the two outputs that I am interested in (“annual_yield” and “average_yield”) into one list to return from the function.
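A rough sketch of such a function. The function name, the synthetic file, and the rule for picking the yearly value (the annual maximum of the daily column) are all assumptions made for illustration; the reader falls back to base `read.csv` when `data.table` is not installed.

```r
# Hypothetical helper; taking the annual maximum is an assumed rule
process_file <- function(path) {
  # Loaded inside the function so that every worker has what it needs
  daily <- if (requireNamespace("data.table", quietly = TRUE)) {
    data.table::fread(path, data.table = FALSE)
  } else {
    read.csv(path)
  }
  years <- daily[[1]]  # column 1 holds the year
  annual_yield <- tapply(daily[["OUT_CROP_BIOMYELD"]], years, max)
  average_yield <- mean(annual_yield)
  list(annual_yield = annual_yield, average_yield = average_yield)
}

# Tiny synthetic file standing in for one file of the real dataset
demo <- data.frame(
  YEAR = rep(c(1981, 1982), each = 3),
  MONTH = 1,
  DAY = rep(1:3, times = 2),
  OUT_CROP_BIOMYELD = c(0, 2, 5, 0, 3, 7)
)
demo_path <- tempfile(fileext = ".csv")
write.csv(demo, demo_path, row.names = FALSE)

out <- process_file(demo_path)
print(out$average_yield)
```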

Now, we need to export our function to the cluster. Because the function uses the “all_samples” data frame, which was created outside the function, this object should also be exported to the cluster:
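A sketch of the export step with `clusterExport` from the `parallel` package. The contents of `all_samples` and the function `process_row` are placeholders; the two-worker cluster is created and stopped here only so that the example runs on its own.

```r
library(parallel)

# Placeholder objects standing in for the real data frame and function
all_samples <- data.frame(file = c("a.txt", "b.txt"), stringsAsFactors = FALSE)
process_row <- function(i) nchar(all_samples$file[i])  # relies on all_samples

cl <- makeCluster(2)

# Copy both objects to every worker; envir says where to look them up
clusterExport(cl, c("process_row", "all_samples"), envir = environment())

# Check on each worker that the data frame has arrived
exported <- clusterEvalQ(cl, exists("all_samples"))

stopCluster(cl)
```

Without the export, each worker starts with an empty workspace and the function would fail when it looks for `all_samples`.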

With the command below, we run the function across the number of cores that we specified earlier, and “system.time” prints the processing time at the end:
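A sketch of the timed parallel run, assuming `parLapply` from the `parallel` package is the function doing the work. `slow_square` is a stand-in for the real per-file function, and the two-worker cluster is created here only to keep the example self-contained.

```r
library(parallel)

# Stand-in for the real per-file function
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

cl <- makeCluster(2)

# system.time() wraps the call; the assignment still happens in this scope
timing <- system.time(
  results <- parLapply(cl, 1:4, slow_square)
)

stopCluster(cl)
print(timing)
```

The "elapsed" entry of the printed timing is the wall-clock time, which is the number to compare against a serial `lapply` run.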

The function outputs are saved in the list “results” that we can extract:
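A sketch of pulling values out of such a list. The numbers below are made up; the structure (one list per file, each holding “annual_yield” and “average_yield”) follows the description of the function above.

```r
# Made-up results with the same shape as the function's return value
results <- list(
  list(annual_yield = c(`1981` = 5, `1982` = 7), average_yield = 6),
  list(annual_yield = c(`1981` = 4, `1982` = 8), average_yield = 6)
)

# Annual series for the first file
first_annual <- results[[1]]$annual_yield

# One average per file, collected into a numeric vector
all_averages <- vapply(results, function(r) r$average_yield, numeric(1))

print(first_annual)
print(all_averages)
```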

Parallel processing windows OS

I am new to parallel processing. I have read through "High performance and parallel computing with R" on CRAN and have come to understand the difference between implicit and explicit parallelism and between distributed computing and embarrassingly parallel problems. I have also read about foreach, multicore, snowfall, the doSMP package (retired from CRAN), etc.

I have Windows as my OS; unfortunately, installing another OS is not an option.

I would like to explain the structure of my code to ask your opinion on the possibilities for implementing parallel computing. The package used is caret.

About 1/3 of my code is largely sequential and cannot be parallelized, and 2/3 is mostly iterative with limited sequentiality (results have to be communicated once at the end of the processes, and some data has to be taken from prior sequential results to be used by the process). This corresponds to the training of a model embedded within a conditional expression below (it can use from 2 to 10 folds and repetitions).

The whole code has to be repeated 50 times with only X changing.
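For a rough sense of the possible payoff, Amdahl's law can be applied to the split described above. This assumes the 2/3 share parallelizes perfectly with no communication overhead, which is optimistic; p is the parallel fraction and n the number of cores.

```latex
S(n) = \frac{1}{(1 - p) + \frac{p}{n}}, \qquad p = \tfrac{2}{3}
\quad\Rightarrow\quad
S(4) = \frac{1}{\tfrac{1}{3} + \tfrac{1}{6}} = 2, \qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p} = 3
```

Under these assumptions a single run is at most 3 times faster however many cores are used, so the 50 independent repetitions are likely the more rewarding level to parallelize.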

My questions are below.

What would be the options with respect to packages needed to implement parallel processing using multiple cores on a Windows OS? rParallel seems to be one, doSMP another.

Given the explained structure of the 2/3 of the code that seems to be parallelizable, would you run it as an embarrassingly parallel problem or opt for serial R loops, given the few folds and repetitions?

How would you approach repeating the same sequence of code 50 times with one minor change: in parallel, or using different R instances simultaneously (for example, in batches)? In the latter case, how would you combine the results of running multiple R instances?

Note that the package doSMP is built into the core build of Revolution R, so you don’t have to install it from CRAN (for this reason, you won’t find it on the package list). All you have to do is load it with require(SMP). The revoIPC package is also necessary. This package does not compile on modern versions of gcc and hence has been archived; doSMP depends on it and therefore went to the archives as well. I am unsure whether you can download the mentioned packages from REvolutions without being a customer.
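As a sketch of the 50-repetition case that does not rely on doSMP, the `parallel` package bundled with R could be used instead; this is a swapped-in technique, not one of the packages named in the question. `run_once` is a hypothetical stand-in for one full pass of the analysis for a given X.

```r
library(parallel)

# Hypothetical stand-in for one full run of the analysis for a given X
run_once <- function(x) {
  list(x = x, score = x^2)
}

x_values <- 1:50

cl <- makeCluster(2)  # socket cluster, works on Windows
runs <- parLapply(cl, x_values, run_once)
stopCluster(cl)

# Combine the per-run results into a single data frame
combined <- do.call(rbind, lapply(runs, as.data.frame))
print(dim(combined))
```

Because each repetition is independent, the results come back as one list in the same order as `x_values`, which avoids merging output from separately launched R instances.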

Parallel processing using TPL in windows service

I have a Windows service that consumes a messaging system to fetch messages. I have also created a callback mechanism with the help of the Timer class, which checks for a message at a fixed interval and then fetches and processes it. Currently, the service processes messages one by one, but I want the processing to run in parallel once messages arrive. So when the first message arrives, it should be processed on one task, and even if that processing has not finished, the next message should be picked up after the configured callback interval (the callback is working now) and processed on a different task.

Below is my code:

But using the Task Factory I am not able to control the number of tasks running in parallel. In my case, I want to configure the number of tasks so that messages are processed as tasks become available. How can I do this?

Update: Updated my above code to add multiple tasks

Below is the code:

2 Answers

I think what you’re looking for will result in quite a large sample. I’m just trying to demonstrate how you would do this with ActionBlock . There are still a lot of unknowns, so I left the sample as a skeleton you can build on. In the sample, the ActionBlock will handle and process all your messages in parallel as they’re received from your messaging system.

Ok, sorry I’m short on time but here’s the general idea/skeleton of what I was thinking as an alternative.

If I’m honest, though, I think the ActionBlock is the better option, as there’s just so much done for you, with the only limit being that you can’t dynamically scale the amount of work it will do at once, although I think the limit can be quite high. If you do it this way instead, you could have more control or a dynamic number of tasks running, but you’ll have to do a lot of things manually. For example, if you want to limit the number of tasks running at a time, you’d have to implement a queueing system (something ActionBlock handles for you) and then maintain it. I guess it depends on how many messages you’re receiving and how fast your process handles them.

You’ll have to check it out and think about how it could apply to your use case, as I think some of the details are a little sketchily implemented on my side around the ConcurrentBag idea.

So the idea behind what I’ve thrown together here is that you can start any number of tasks, or add to the tasks running or cancel tasks individually by using the collection.

The main thing I think is just making the method that the Callback runs fire off a thread that does the work, instead of subscribing within a separate thread.

I used Task.Factory.StartNew as you did, but stored the returned Task object in an object ( TaskInfo ) which also had its CancellationTokenSource and its Id (assigned externally) as properties, and then added that to a collection of TaskInfo which is a property on the class this is all a part of:

Updated: to avoid this being too confusing, I’ve just updated the code that was here previously.

You’ll have to update bits of it and fill in the blanks in places like with whatever you have for my HeartbeatController , and the few events that get called because they’re beyond the scope of the question but the idea would be the same.

Hope this gives you an idea of how else you can approach your problem and that I didn’t miss the point :).

Update

To use events to update progress or which tasks are processing, I’d extract them into their own class, which then has subscribe methods on it, and when creating a new instance of that class, assign the event to a handler in the parent class which can then update your UI or whatever you want it to do with that info.

So the content of Process() would look more like this:

Parallel processing in windows