1 view (last 30 days)

Show older comments

Christos Dimopoulos on 18 Feb 2021

Commented: Bjorn Gustavsson on 26 Feb 2021

Open in MATLAB Online

I tried to make the variables single to decrease the calculation time.

It's a gravitational interaction calculation between particles.

How can I vectorise this code to avoid so many loops?

Can I do it since the particles could be more than 1 million and there should be built matrixes 1million x1million cells?

Most time spent on these lines

d = [xj yj zj] - [xi yi zi]; % 3D distance as a vector

dr2 = d(1)^2+d(2)^2+d(3)^2; %3D distance as absolute value

AccelDouble = (d/sqrt(dr2))*((G*mj*(sign(mi)))/dr2); % acceleration increment

ar = ar + single(AccelDouble); % New acceleration

n = size(heavy,1);

AccelDouble = zeros(3,'double') ;

for i = 1:n

mi = heavy(i,1); % mass of first particle

if mi ~= 0 % Avoid calculate 0 mass objects

xi = heavy(i,2); % x position of first particle

yi = heavy(i,3); % y position of first particle

zi = heavy(i,4); % z position of first particle

ar = [0 0 0]; %3D

for j = 1:n % second count for reacting particles

if i ~= j % to avoid counting the same particles

mj = heavy(j,1); % mass of second particle (interacting)

if mj ~= 0 && mi ~= 0 % Avoid calculate 0 mass objects

xj = heavy(j,2); % x position of second particle

yj = heavy(j,3);

zj = heavy(j,4);

d = [xj yj zj] - [xi yi zi]; % 3D distance as a vector

dr2 = d(1)^2+d(2)^2+d(3)^2; %3D distance as absolute value

AccelDouble = (d/sqrt(dr2))*((G*mj*(sign(mi)))/dr2); % 3D acceleration increments

ar = ar + single(AccelDouble); % New 3D acceleration

end

end

end

for k = 1:3

heavy1(i,4+k) = heavy1(i,4+k) + ar(k)*timeStep; % 3D New Speed calculation

heavy1(i,k+1) = heavy1(i,k+1) + timeStep*heavy1(i,k+4); % 3D New posistion calculation

end

end

end

##### 6 Comments Show 4 older commentsHide 4 older comments

Show 4 older commentsHide 4 older comments

Rik on 18 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1338874

I don't have an answer to your actual question, but I did notice two things:

Firstly, indexing takes time as well, so creating xyz_i=heavy(i,2:4); and doing dr2=sum((heavy(j,2:4)-xyz_i).^2); might be faster (I did not test whether that is true).

Secondly, heavy1 is growing dynamically as far as we can tell.

Christos Dimopoulos on 19 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1339599

Thanks for your comment. Unfortunately this didn't seem to work.

As can be seen it created a time lag there.

I am using d function in one othr point, and as you can see although it keeps being one of the most time expensive rows, it is 5 times faster.

and I didn't have it before

Rik on 19 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1339714

Well, it was worth a try.

I still don't have a practical solution, but this sounds like something a GPU should be able to do well. I have never worked with the parallel computing toolbox or GPUArrays, but you might want to have a look.

I am thinking about a way to vectorize one of your loops. Vectorizing the whole operation might be too memory-intensive. I will post a comment or answer once I have something. Could you post code that generates example data so I can profile code myself as well?

Jan on 19 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1340259

Edited: Jan on 19 Feb 2021

Open in MATLAB Online

The iterativ growing of arrays wastes a lot of ressources. Care for a propper pre-allocation.

Please post some meaningful example data, e.g. created by rand() if this is suffcient. It is easier to examine the runtime of a function, if we can run it. And without an examination, suggesting improvements is pure guessing. Is this sufficient:

heavy = rand(1e6, 4) * 1e6

% What is heavy1?

Professional simulations have used Grape-boards, which have a hard-coded 1/x^2 function with a marvelous speedup.

Christos Dimopoulos on 19 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1340429

Open in MATLAB Online

The creation loop goes like this.

I have already tried also GPU (I am not an expert of course) but no luck up to now.

bodies = zeros(n,8);

if n > 1

for i = 1:n

m = (rand*9+1)*m0;

r = radiusOuter*(1.03-rand^1); %Distribution inside a sphere

theta = rand*(2*pi);

phi = acos(1-rand*2);

x = r*sin(phi)*cos(theta); %Spherical Distribution

y = r*sin(phi)*sin(theta);

z = r*cos(phi);

dx = 1*sin(phi)*cos(theta); % Spherical outward velocity to implement explosion

dy = 1*sin(phi)*sin(theta);

dz = 1*cos(phi);

v = r ; %initial radial outward velocity

bodies(i,1) = m;

bodies(i,2) = x;

bodies(i,3) = y;

bodies(i,4) = z;

bodies(i,5) = dx*v*(1+0.5*(0.5-rand));

bodies(i,6) = dy*v*(1+0.5*(0.5-rand));

bodies(i,7) = dz*v*(1+0.5*(0.5-rand));

bodies(i,8) = 0; % for merging

end

end

Walter Roberson on 26 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1357079

Can I do it since the particles could be more than 1 million and there should be built matrixes 1million x1million cells?

An array that was 1e6 x 1e6 would take 1e12 times the space for an individual element. For single precision that would be 4e12 bytes, which is slightly more than 3.6 terabytes, using 2^40 bytes as the definition of terabytes here.

... for each array.

Does your system have access to multiple terabytes of memory?

but this sounds like something a GPU should be able to do well.

The largest commercially available GPU is the NVIDIA Titan V, with 12 GB of memory. Considering the need for multiple matrices, you probably would not be able to handle much more than 25000 x 25000 single precision on such a system.

Sign in to comment.

Sign in to answer this question.

### Answers (2)

Bjorn Gustavsson on 19 Feb 2021

Newton says you can cut it in two if you take into account that the force from particle i on particle j is equal but directed in opposite direction of force from particle j on particle i. Also if proper pre-allocation is tricky just run the loops from n down to 1.

Finally if this is some kind of explicit Euler scheme I suggest you take a look at Størmer-Verlet schemes instead (or perhaps the 12th-10th-order-runge-kutta-nystrom-integrator that might also be an option) that ought to be more suited for equations of motion.

##### 1 Comment Show -1 older commentsHide -1 older comments

Show -1 older commentsHide -1 older comments

Christos Dimopoulos on 26 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1356359

Thank you for your answer.

You are correct that the same force value (and opposite vector) applies to both the masses but I am not sure how to put it in the code, as it seems will I have to create in each iteration a huge NxN matrix, that has limits to 10^5 particles. If we found a way without any addition in calculations we could reduce the calculation time to half!

As far as the code scheme if I am not wrong the ones that you have suggested are ways to increase precission between two steps without making less and less steps of more and more large period. We are not there yet. Also we will try some more physics changes before we proceed to complicating the code.

Sign in to comment.

Jan on 19 Feb 2021

Edited: Jan on 19 Feb 2021

Open in MATLAB Online

ar = [0 0 0];

This creates a double array.

ar = ar + single(AccelDouble);

Here AccelDouble is converted to a single at first, which must be converted back to a double, because it is added to a double. This is a waste of time.

Either convert all values to singles from the beginning. Or stay at doubles.

mi ~= 0 is excluded twice. If would be more efficient, to remove all m==0 before the loops.

"sign(mi)"? Do you mean "single(mi)"?

AccelDouble = zeros(3,'double') ;

This produces a [3 x 3] array. Because it is overwritten ineach iteration, this is not a pre-allocation, but a waste of time. It does not matter, that it is overwritten by a [1 x 3] array, but confusing only.

I assume, your code can be accelerated. What are the used sizes of the input?

The integration scheme based of delta(v) = delta(t)*a is very basic. There are integrations methods on a higher order, which allow to increase the step sizes, which reduces the run time.

[EDITED] Vectorizing the inner loop and using SINGLE for all data reduces the runtim from 1.64 sec to 0.06 sec for 2e3 objetcs:

function test

% Mass, position, speed

heavy = rand(2e3, 7) .* [1e9, 1e6, 1e6, 1e6, 1e3, 1e3, 1e3];

tic

heavy = Core1(heavy);

toc

heavyS = single(heavy)

tic

heavy = Core1(heavyS);

toc

end

function heavy = Core3(heavy)

timeStep = 0.1;

G = 6.67430e-11;

M = heavy(:, 1);

X = heavy(:, 2:4);

V = heavy(:, 5:7);

n = numel(M);

MG = M * G;

for i = 1:n

mi = M(i);

D = X - X(i, :);

r2 = sum(D .* D, 2);

F = (D ./ sqrt(r2)) .* (MG * mi ./ r2);

F(i, :) = 0;

Ft = sum(F, 1);

V(i, :) = V(i, :) + Ft * timeStep;

X(i, :) = X(i, :) + timeStep * V(i, :);

end

heavy(:, 2:4) = X;

heavy(:, 5:7) = V;

end

% Elapsed time is 1.880953 seconds. Original code

% Elapsed time is 0.140433 seconds. Vectorized, double

% Elapsed time is 0.066267 seconds. Vectorized, single

I've implemented this with the assumption that your "sign(mi)" was thought as "single(mi)".

But note this: In the current implementation the problem is O(n^2). Doubling the input size does increase the runtime by a factor 4. I've tested the code with 2e3 rows. For 1e6 rows this will be more than 4 hours per time step.

How many timesteps do you want to calculate? I assume, there is no way to perform this efficiently in Matlab with this simple apporach. Even if you convert this to a C-mex method and increase the speed by a factor 2 (a pure guess only!), 2 hours per time step is still far away from allowing a serious number of time steps.

Bjorn has mentioned, that the force matrix is symmetrical, but it is not trivial to exploit this for 1e6 elements without exhausting the the memory. Using a sophisticated intergator is essential to reduce the number of timesteps for a certain accuracy.

Professional simulations use e.g. the leap-frog algorithm. This simplifies the computation by combining a neighborhood of elements to their center of mass: If we simulate the trajectories of the stars in the galaxy, the distribution of other stars in a spatial segment far away is in first approximation the force of its center of mass containing the sum of the masses. This can reduce the computing time by a factor 100 without loosing too much accruracy.

##### 6 Comments Show 4 older commentsHide 4 older comments

Show 4 older commentsHide 4 older comments

Christos Dimopoulos on 24 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1352764

Edited: Christos Dimopoulos on 25 Feb 2021

Open in MATLAB Online

Thanks you so much for your time.

It seems that you helped a lot. I will mention the unsolved ones of your remarks.

I am trying a rather rare hypothesis of negative mass, so that's why I needed sign(m).

In your equation you have changed some parts that I tried to fix but (not understanding why) I get different results.

In the specific line you have:

F = (D ./ sqrt(r2)) .* (MG * mi ./ r2);

and I made it to match my equation:

F = (D ./ r2) .* (MG * sign(M) ./ (r2 .^ 2));

You made it force with two masses but I wanted acceleration with just one mass and only the sign of the other.

I compare the two functions and I get different results (both in double presicion of course).

The two functions are:

function heavy1 = Core3(heavy,timeStep,GxSolarMass)

heavy1 = heavy ;

M = heavy(:, 1);

X = heavy(:, 2:4);

V = heavy(:, 5:7);

n = numel(M);

%MG = M * G;

for i = 1:n

mi = M(i);

MG = mi * G;

D = X - X(i, :);

r2 = sqrt(sum(D .* D, 2));

F = (D ./ r2) .* (MG * sign(M) ./ (r2 .^ 2));

F(i, :) = 0;

Ft = sum(F, 1);

V(i, :) = V(i, :) + Ft * timeStep;

X(i, :) = X(i, :) + timeStep * V(i, :) ;

end

heavy1(:, 2:4) = X;

heavy1(:, 5:7) = V;

end

and:

function heavy1 = Initial(heavy,timeStep,G)

n = size(heavy,1);

heavy1 = heavy;

for i = 1:n

mi = heavy(i,1);

if mi ~= 0

xi = heavy(i,2);

yi = heavy(i,3);

zi = heavy(i,4);

ar = [0 0 0];

for j = 1:n

mj = heavy(j,1);

if i ~= j && mj ~= 0

xj = heavy(j,2);

yj = heavy(j,3);

zj = heavy(j,4);

d = [xj yj zj] - [xi yi zi];

dabs = sqrt(sum(d.^2));

ar = ar + (d/dabs)*G*mj*sign(mi)/(dabs^2);

end

end

for k = 1:3

heavy1(i,k+4) = heavy1(i,k+4) + ar(k)*timeStep;

heavy1(i,k+1) = heavy1(i,k+1) + timeStep*heavy1(i,k+4);

end

end

end

end

Thanks again for your time and help.

Bjorn Gustavsson on 25 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1353474

"You made it force with two masses but I wanted acceleration with just one mass and only the sign of the other."

That is a very "interesting" idea, but to my understanding not relevant to any physics as far as we know. Think about what that means for the acceleration of two bodies with different mass.

Christos Dimopoulos on 26 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1355734

You are correct. Mainstream physics keeps a distance from negative mass, but it is not yet proven. We are doing some research on different behaviours of matter in strange cases.

Bjorn Gustavsson on 26 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1355829

Do you also have any reason to change the behaviour of:

The way you put it it sounds like you'll get a very strange assymetry between the forces.

Christos Dimopoulos on 26 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1356984

If we take into consideration negative mass (whatever that means) then we have between same kind attraction and between different kind repulsion. It gives relative but antisymmetric results than EM field.

But if I understand correctly, we are discussing now physics and not matlab code. I see that you are space physics professor, if you have more interest in the results it will be my pleasure to talk in another discussion.

Bjorn Gustavsson on 26 Feb 2021

#### Direct link to this comment

https://matlabcentral.mathworks.com/matlabcentral/answers/749724-how-to-vectorise-or-speed-up-the-specific-code#comment_1357119

Yes then you have what essentially boils down to identical equations as in PIC plasma-simulations. Have a look at what people do in that field.

Sign in to comment.

Sign in to answer this question.

### See Also

### Categories

Mathematics and OptimizationPartial Differential Equation ToolboxDomain-Specific ModelingElectromagnetics

Find more on **Electromagnetics** in Help Center and File Exchange

### Tags

- vectorization
- speed

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

**An Error Occurred**

Unable to complete the action because of changes made to the page. Reload the page to see its updated state.

Select a Web Site

Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .

You can also select a web site from the following list

Americas

- América Latina (Español)
- Canada (English)
- United States (English)

Europe

- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)

- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- Deutsch
- English
- Français

- United Kingdom(English)

Asia Pacific

- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)

Contact your local office