Force use of fork in multiprocessing
From Tomasz Balawajder: "Since we are using a Java service to launch the Python process, its behavior differs from running the script directly on the cluster. By default, Dask uses fork() to create worker processes. However, when running under the JVM, the start method defaults to spawn, which does not share memory between processes. This caused the slowdown and unexpected behavior. I’ve forced Python to use fork() in the configuration, and now the application completes in the same time as when executed with sbatch."
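To see why this matters, note that the default start method depends on the platform and launch context. A minimal check (a sketch, not part of the commit) to inspect what the interpreter would use:

```python
import multiprocessing as mp

# Show the start method this interpreter would use by default and the
# methods the platform supports. "spawn" starts a fresh interpreter and
# does not share the parent's memory; "fork" copies the parent's address
# space (copy-on-write), which is what the commit relies on.
print(mp.get_start_method())       # platform/launch dependent
print(mp.get_all_start_methods())  # e.g. ['fork', 'spawn', 'forkserver'] on Linux
```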
@@ -69,6 +69,7 @@ def main(catalog_file, mc_file, pdf_file, m_file, m_select, mag_label, mc, m_max
 from matplotlib.contour import ContourSet
 import xml.etree.ElementTree as ET
 import json
+import multiprocessing as mp
 
 logger = getDefaultLogger('igfash')
 
@@ -448,9 +449,10 @@ verbose: {verbose}")
 
     start = timer()
 
-    use_pp = False
+    use_pp = True
 
     if use_pp: # use dask parallel computing
+        mp.set_start_method("fork", force=True)
         pbar = ProgressBar()
         pbar.register()
         iter = indices
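The effect of the added `mp.set_start_method("fork", force=True)` call can be sketched in isolation. The names below (`BIG`, `probe`) are hypothetical and not from the patched script; the point is that with `fork`, worker processes inherit parent state directly instead of re-creating it:

```python
import multiprocessing as mp

BIG = list(range(1_000_000))  # built once in the parent process

def probe(_):
    # Under fork, the child inherits BIG from the parent's address
    # space (copy-on-write) without pickling; under spawn, each child
    # re-imports the module and rebuilds it instead.
    return len(BIG)

if __name__ == "__main__":
    # Same call the commit adds: force fork even where the embedding
    # environment (e.g. a JVM-launched interpreter) defaults to spawn.
    mp.set_start_method("fork", force=True)
    with mp.Pool(2) as pool:
        print(pool.map(probe, range(2)))  # [1000000, 1000000]
```

Note that `fork` is only available on Unix-like systems, which matches the cluster environment described in the commit message.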