FastFieldSolvers Forum

FasterCap not going out of core


Topic Review
chgad	Posted - May 03 2019 : 10:08:11 Hello everyone, I have encountered a setup where my machine's available RAM (approx. 27 GB) isn't enough to solve the problem, and at a certain point in FasterCap's discretization process my machine halts (losing all info about previous results).

This is the moment where I re-read the FasterCap docs and thought about modifying the "Out-of-Core free memory to link memory condition -f". But I'm not quite sure I understand it correctly: let the value for -f be n. Whenever FasterCap creates a new discretization block, it checks whether n * (link memory) is greater than the free (RAM) memory. If this condition is fulfilled, FasterCap goes out of core; if not, it allocates RAM. Is this understanding correct? If so, I still encounter FasterCap causing my machine to halt, again losing all previous results.

Furthermore, I read in a post in this forum about modifying the "Direct potential interaction coefficient to mesh refinement ratio -d". This parameter should control the "correlation" of how many panels are considered when calculating the contribution of another panel. Is this correct?

I'd really like to understand those options to get my models working. Processing time isn't really a problem right now, and I'm well aware that going out of core will increase the time needed. Thanks in advance for any advice and answers.
9 Latest Replies (Newest First)
Enrico	Posted - Jun 12 2019 : 11:15:34 I'm sorry, I probably need to phrase it differently. When I talk about bad scaling I don't mean the size of the problem, but the bad scaling of the geometry, e.g. having large structures that in some points are very close to each other, for instance a 100 micron by 100 micron plane with a distance of 10 nanometers to another structure. Another example may be very small features leading to very small triangles in the input file. In most of the cases however the issues in GMRES convergence are caused by overlapping or intersecting planes, this is why I mentioned that. Best Regards, Enrico
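To put numbers on the "bad scaling" examples mentioned in this reply, here is a minimal sketch that computes the ratio between the largest and smallest length in a geometry. The helper name `scale_ratio` is hypothetical, and the thread gives no threshold above which a geometry counts as badly scaled, so the values are only useful for comparison.

```python
def scale_ratio(lengths_nm):
    """Ratio between the largest and smallest characteristic length (same unit)."""
    lengths = list(lengths_nm)
    if not lengths or min(lengths) <= 0:
        raise ValueError("lengths must be positive and non-empty")
    return max(lengths) / min(lengths)


# Example from the reply: a 100 micron x 100 micron plane, 10 nm from another structure.
plane_example = scale_ratio([100_000, 100_000, 10])

# Dimensions quoted by chgad: 0.11 micron x 260 micron x 139 micron.
thread_structure = scale_ratio([110, 260_000, 139_000])

print(f"plane example:    {plane_example:.0f}")
print(f"thread structure: {thread_structure:.0f}")
```

The ratio only captures overall dimensions. It says nothing about overlapping or intersecting panels, which the reply names as the more common cause of GMRES convergence problems.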
chgad	Posted - Jun 11 2019 : 19:02:15 I'm always using the Jacobi preconditioner. In terms of scaling, we are talking about a structure with approximate dimensions of 0.11 micron x 260 micron x 139 micron. I'm not sure if this qualifies as "really really badly scaled" as you mentioned.
Enrico	Posted - Jun 11 2019 : 18:37:52 If your GMRES iterations go up to 400, there's something wrong, unless your geometry is really really badly scaled in nature. Please try to use a Jacobi preconditioner and see what happens; also make sure there are no overlapping or intersecting panels in your input geometry. Best Regards, Enrico
chgad	Posted - Jun 11 2019 : 16:18:25 After the initial calculation, the run takes approx. 400 GMRES iterations (~2 hours of computation). The run after the next refinement takes more than 800 GMRES iterations (which is where my machine halts). So I guess the panels don't fit in the RAM?

Nevertheless, this gave me an idea of how to at least somehow solve my case. It's too simple, to be honest: I could simulate half of the structure, which should still be large enough that the border terms do not interfere that much. With that, I can simply compute the whole structure as a parallel circuit of two of those half structures.

One thing still bothers me: is there a way to quantitatively determine whether or not my structure is actually too big?
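The half-structure idea in this post relies on the fact that capacitances in parallel add up. A minimal sketch of that arithmetic follows, assuming the two halves are identical and that coupling between them and fringing at the cut can be neglected (the "border terms" mentioned in the post). The function name and the numeric value are hypothetical, not results from the thread.

```python
def parallel_capacitance(*capacitances_farads):
    """Total capacitance of capacitors connected in parallel (plain sum)."""
    if not capacitances_farads:
        raise ValueError("at least one capacitance is required")
    return sum(capacitances_farads)


# Hypothetical result of simulating one half of the structure, in farads.
half_structure = 1.5e-12

# Full structure approximated as two identical halves in parallel.
full_structure = parallel_capacitance(half_structure, half_structure)
print(f"{full_structure:.3e} F")
```

One way to check the border-term assumption would be to apply the same split to a smaller structure that still fits in memory and compare the sum against the full simulation.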
Enrico	Posted - Jun 11 2019 : 15:20:53 FasterCap will go out-of-core only for storing the mutual interactions (the most memory-intensive part, i.e. the compressed interaction matrix); it will not store the panel structures out-of-core. So in the end, if you have too many panels, you will hit the physical memory limit. The assumption here is that if you need that much memory for the panels alone, you would move a huge amount of memory out-of-core, making the whole solution impractical in a reasonable timeframe.

However, if you have a large number of GMRES iterations, this is a flag that something is possibly wrong or not optimized in the input geometry.

If instead your limit is in the automatic refinement increase (the 'iteration #n' and the Frobenius norm of the difference), you should try to drive the simulation manually. One obvious reason for doing that in large simulations is that, to verify convergence, the software needs two consecutive simulations that in the end give almost the same results, so the last run could have been skipped. If you perform a kind of 'calibration' with a simpler structure, but similar geometry / geometrical ratios, you can then perform one single simulation with the -m and -d parameters you tuned.

Best Regards, Enrico
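The convergence check described in this reply compares two consecutive refinement runs through the Frobenius norm of their difference. A minimal sketch of such a comparison is shown below. The normalization by the norm of the latest result, the tolerance value, and the example matrices are assumptions made for illustration; the thread does not state how FasterCap computes or thresholds this quantity.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of the squared entries of a matrix."""
    return math.sqrt(sum(value * value for row in matrix for value in row))


def relative_difference(previous, current):
    """Frobenius norm of (current - previous), divided by the norm of current."""
    difference = [
        [cur - prev for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]
    return frobenius_norm(difference) / frobenius_norm(current)


# Hypothetical 2x2 capacitance matrices (arbitrary units) from two consecutive runs.
run_1 = [[10.0, -2.0], [-2.0, 8.0]]
run_2 = [[10.1, -2.1], [-2.1, 8.0]]

tolerance = 0.01  # hypothetical stopping tolerance
change = relative_difference(run_1, run_2)
print(f"relative change: {change:.4f}, converged: {change < tolerance}")
```

This also shows why the automatic mode costs an extra run: the change can only be evaluated once the second, finer result exists. A calibrated single run with fixed -m and -d values skips that second run.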
chgad	Posted - Jun 11 2019 : 10:20:40 Let me try to reformulate; maybe my understanding of the out-of-core mechanism is still not correct. When running the simulation with -f=10 and -f=100, I noticed that it progressed much further in the second run. This led me to assume that in the second case much more of the disk memory was used, which is reasonable. So by "going out-of-core completely" I mean that, at a certain point, I would expect FasterCap to use ONLY the disk memory and spare some RAM so that the machine does not halt. On the contrary, so far it seems that only some (albeit many) intermediate steps are using the disk memory. My guess was that by increasing the value of -f I would eventually enforce this behavior.
Enrico	Posted - Jun 05 2019 : 12:36:42
quote: Originally posted by chgad: "still did not go out-of-core completely."

While your problem could indeed be too big, I'm not sure I fully understand your sentence here. What do you mean by 'did not go out-of-core completely'? Thanks, Enrico
chgad	Posted - Jun 05 2019 : 10:41:02
quote: "However you need to consider that with modern operating systems (i.e. all..), the memory management is fully in charge of the OS, that is usually leveraging also some swap space on the hard disk."

I have now manually disabled the swap space on my machine and tried -f=10 and -f=100. FasterCap indeed progressed further than with -f=5 but still did not go out-of-core completely. I'll try -f=1000 next week, but I'm not really optimistic about it, and I'm slowly starting to think that my problem (250 micron x 140 micron) is actually too big for FasterCap.
Enrico	Posted - May 06 2019 : 13:17:19
quote: "Whenever FasterCap creates a new discretization block it checks whether the n * (link memory) is greater than the free (RAM) memory. If this condition is fulfilled FasterCap goes out of Core, if not it allocates RAM memory."

Yes, this is correct, but the check is not done when creating a new discretization block. It is done when estimating the memory required for the total of the links for a given discretization and interaction value set.

However, you need to consider that with modern operating systems (i.e. all of them), memory management is entirely in the hands of the OS, which usually also leverages some swap space on the hard disk. So the sum of the memory required by all applications / processes can be larger than the actual physical memory, as the OS virtualizes it and swaps the 'most unused' sections of memory to the hard disk.

Unfortunately, for out-of-core applications this is an issue if you cannot instruct the OS to avoid the swap. You want to control that, because the intended serialization of the memory out-of-core is much more efficient than the generic (even if intelligent) algorithms used by the OS to handle the memory allocation requests. Without that control, you end up getting closer to the physical memory limit BEFORE going out-of-core (as the OS still accepts memory allocation requests and/or shows more free memory than it has), and you start using the swap, with a great slow-down. Only when the swap is no longer sufficient do you go out-of-core, but this is doubly slow, as the memory you are carrying out of core is actually on the hard disk swap file / partition.

This is the ultimate reason why FasterCap goes out-of-core as soon as the links need only a fraction of the overall free memory reported by the OS. You may try to lower this threshold, i.e. increase the -f value. Usually a value of 5 will do, but it really depends. The downside of this approach is of course that you go out-of-core much earlier than needed, with an overall unneeded slow-down. Note, by the way, that if you have an SSD the penalty for going out-of-core is much reduced, as access is faster.

quote: "Direct potential interaction coefficient to mesh refinement ratio -d". This parameter should control the "correlation" of how many panels are considered when calculating the contribution of another panel."

Yes, or in other words, how many 'links' per panel are considered.

However, if you still run out of memory even after reducing the number of links and going out-of-core, the problem is possibly an excessive number of panels altogether. The out-of-core algorithm serializes the links. These usually grow linearly with the number of panels, but through a multiplication coefficient, so they are predominant: if you cannot fit the panels in memory, you certainly cannot fit the links, while the reverse (panels fit, links do not) is possible most of the time. So if you end up with too many panels, I'm afraid that you cannot solve your problem with your current memory configuration.

Note however that if you select the '-i' and '-v' options you should get more detailed information about what is happening, and about the actual memory consumption. This should help you understand where the bottleneck is.

Best Regards, Enrico
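The -f condition confirmed in this reply can be written down as a small sketch: links go out-of-core when f times the estimated link memory exceeds the free memory reported by the OS. The function names below are hypothetical and this is not FasterCap's actual code. The 27 GB figure is the RAM size quoted in the first post, used here as a stand-in for the free memory.

```python
def goes_out_of_core(link_memory_bytes, free_memory_bytes, f):
    """True when f * (estimated link memory) exceeds the free memory reported by the OS."""
    return f * link_memory_bytes > free_memory_bytes


def link_memory_threshold(free_memory_bytes, f):
    """Largest link-memory estimate that is still kept in RAM for a given f."""
    return free_memory_bytes / f


GB = 10 ** 9
free_memory = 27 * GB  # assumption: all of the 27 GB from the first post is free

# Values of f mentioned in the thread: the default-like 5, then 10 and 100.
for f in (5, 10, 100):
    threshold_gb = link_memory_threshold(free_memory, f) / GB
    print(f"f = {f:>3}: links larger than {threshold_gb:.2f} GB go out-of-core")
```

Raising f lowers the link size at which the switch happens, which matches the observation later in the thread that runs with larger -f values progressed further. The condition covers the links only, so it cannot help once the panels themselves no longer fit in RAM.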
