Cannot pickle multiprocess examples
Vitals
$ hostnamectl
Static hostname: schooner3.oscer.ou.edu
Icon name: computer-server
Chassis: server
Machine ID: b85f324c55fb4829bdcf5207b50cf24e
Boot ID: cf11b7300a094dd389527d8d104078d6
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-1127.el7.x86_64
Architecture: x86-64
$ module list
No modules loaded
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
Stepping: 2
CPU MHz: 1283.081
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4594.90
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb invpcid_single ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear spec_ctrl intel_stibp flush_l1d
Observation
When attempting to run the Sodankyla example, the default option which uses 2 processes fails on my instance.
$ python3 run.py
Example data are already available on disk
2024-04-04 16:25:29,570, INFO: ++++ Welcome to PROFFASTpylot ++++
2024-04-04 16:25:29,605, INFO: Run information:
Retrieval for Instrument SN039 at Sodankyla with time offset 0.0.
The following dates will be processed:
2017-06-08, 2017-06-09.
2024-04-04 16:25:30,011, INFO: Running preprocess with 2 task(s) ...
2024-04-04 16:25:30,059, INFO: Removing temporary files ...
2024-04-04 16:25:30,063, INFO: Done.
Traceback (most recent call last):
File "run.py", line 26, in <module>
MyPylot.run(n_processes=2)
File "/home/USER/PROFFAST/proffastpylot/prfpylot/pylot.py", line 60, in run
self.run_preprocess(n_processes=n_processes)
File "/home/USER/PROFFAST/proffastpylot/prfpylot/pylot.py", line 111, in run_preprocess
output = pool.map(subs_method, all_inputfiles)
File "/usr/lib64/python3.6/multiprocessing/pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib64/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
File "/usr/lib64/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
put(task)
File "/usr/lib64/python3.6/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/usr/lib64/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
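The last frame of the traceback can be reproduced outside PROFFASTpylot. The sketch below is not taken from prfpylot: the class Runner and the helper try_pickle are hypothetical stand-ins for an object that holds a re-entrant lock somewhere in its attributes and whose bound method is handed to pool.map. Pickling a bound method also pickles the instance behind it, so the lock makes the whole task unpicklable.

```python
import pickle
import threading


class Runner:
    """Hypothetical stand-in for an object that owns a re-entrant lock."""

    def __init__(self):
        # Any attribute holding a lock makes the instance unpicklable.
        self.lock = threading.RLock()

    def work(self, item):
        return item * 2


def try_pickle(obj):
    """Return None if obj pickles, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    # multiprocessing pickles each task before sending it to a worker;
    # a bound method drags its instance (and the lock) along with it.
    print(try_pickle(Runner().work))
```

The exact wording of the message differs between Python versions, but it names the RLock in each case.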
After modifying run.py to use 1 process instead of 2, the example runs to completion.
$ python3 run.py
Example data are already available on disk
2024-04-04 16:42:51,987, INFO: ++++ Welcome to PROFFASTpylot ++++
2024-04-04 16:42:52,014, INFO: Run information:
Retrieval for Instrument SN039 at Sodankyla with time offset 0.0.
The following dates will be processed:
2017-06-08, 2017-06-09.
2024-04-04 16:42:52,263, WARNING: The analysis folder /home/USER/PROFFAST/proffastpylot/example/analysis/Sodankyla_SN039 exists already! The content may be overwritten.
2024-04-04 16:42:52,264, WARNING: The result directory /home/USER/PROFFAST/proffastpylot/example/results/Sodankyla_SN039_170608-170609 exists already! Renamed existing one to /home/USER/PROFFAST/proffastpylot/example/results/Sodankyla_SN039_170608-170609_backup0 and created a new one.
2024-04-04 16:42:52,291, INFO: Running preprocess with 1 task(s) ...
2024-04-04 16:43:10,474, INFO: Finished preprocessing.
2024-04-04 16:43:10,475, INFO: Running pcxs with 1 task(s) ...
2024-04-04 16:49:38,089, INFO: Finished pcxs.
2024-04-04 16:49:38,090, INFO: Running invers with 1 task(s) ...
2024-04-04 16:49:48,683, INFO: Finished invers.
2024-04-04 16:49:48,816, INFO: The combined results of PROFFAST were written to /home/USER/PROFFAST/proffastpylot/example/results/Sodankyla_SN039_170608-170609/comb_invparms_Sodankyla_SN039_170608-170609.csv.
2024-04-04 16:49:48,816, INFO: Removing temporary files ...
2024-04-04 16:49:48,942, INFO: Done.
Expected Behavior
Ideally, running the Fortran-based retrieval with multiple processes would work out of the box on a supercomputer.
Other notes
Is this issue on our side or the PROFFAST side? What information would you need about the system configuration to determine if this is due to how our computer/instance/operating system is configured?
==================================
EDIT
This appears to be a Python version problem. The original example was run with Python 3.6, where the task that multiprocessing has to pickle contains a thread lock (_thread.RLock) and pickling fails. See https://github.com/microsoft/qlib/issues/391. Upgrading Python avoids the error; here is a workaround:
Workaround
Note that this particular workaround is machine-specific. The important part is to upgrade your Python version.
$ python3 --version
Python 3.6.8
$ module load Python/3.11.5-GCCcore-13.2.0
$ python3 --version
Python 3.11.5
$ cd /path/to/proffastpylot
$ pip3 install --editable .
The newer Python module must be loaded BEFORE the pip package is installed.
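Because the interpreter has to be switched before installing, it can help to confirm which python3 is active first. The following is a minimal sketch, not part of PROFFASTpylot. The names MIN_VERSION and version_ok are made up for illustration, and the (3, 8) threshold is an assumption taken from the suggestion at the end of this issue; only 3.6.8 (fails) and 3.11.5 (works) were actually tried here.

```python
import sys

# Assumption: mirrors the "<3.8" threshold suggested in this issue.
# Only Python 3.6.8 (fails) and 3.11.5 (works) were actually tested.
MIN_VERSION = (3, 8)


def version_ok(version_info=None, minimum=MIN_VERSION):
    """Return True if (major, minor) of version_info meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # Show which interpreter would be used for the install.
    print("interpreter:", sys.executable)
    print("version:", ".".join(str(part) for part in sys.version_info[:3]))
    if not version_ok():
        sys.exit("Python is too old; load a newer module before pip3 install.")
```

Run it with the same python3 that will later run pip3 install, so that the check and the install see the same interpreter.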
Suggested Action
PROFFASTpylot should either refuse to install under Python versions below 3.8 and raise an error, or provide some backwards compatibility. Is there a from __future__ import that covers pickling and threading?
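For the first option, the python_requires metadata field of setuptools is the usual way to make pip refuse an unsupported interpreter. For the second, __future__ imports only switch on language-level features, so they are unlikely to help with pickling. One common pattern is to leave the unpicklable attribute out of the pickled state and rebuild it in the worker. The sketch below illustrates that pattern under stated assumptions: the class Task and its attributes are hypothetical and do not reflect how prfpylot is actually structured, and the attribute that holds the lock there has not been identified in this issue.

```python
import pickle
import threading


class Task:
    """Hypothetical stand-in; not the actual prfpylot class."""

    # Attributes that cannot be pickled and are rebuilt after unpickling.
    _transient = ("lock",)

    def __init__(self, name):
        self.name = name
        self.lock = threading.RLock()

    def __getstate__(self):
        # Copy the instance dict and drop the unpicklable entries.
        state = self.__dict__.copy()
        for key in self._transient:
            state.pop(key, None)
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then create a fresh lock.
        self.__dict__.update(state)
        self.lock = threading.RLock()

    def run(self, item):
        with self.lock:
            return "{}:{}".format(self.name, item)


if __name__ == "__main__":
    # Round trip through pickle, as multiprocessing would do for a task.
    clone = pickle.loads(pickle.dumps(Task("SN039")))
    print(clone.run("2017-06-08"))
```

The rebuilt lock is a new object in each process, which is only appropriate if the lock was never meant to be shared across processes.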