Concurrent logging w/cl-syslog results in memory corruption #92

@appleby

While attempting to run the pyquil test suite in parallel via the pytest-xdist plugin, I noticed occasional "Unhandled memory fault" errors like the one in the traceback below.

Based on the error message, this looked similar to quil-lang/qvm#110, so I tried disabling logging in quilc and, sure enough, the errors disappeared.

pyquil/tests/test_operator_estimation.py:1303:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyquil/operator_estimation.py:938: in measure_observables
    calibr_results, d_calibr_qub_idx = _exhaustive_symmetrization(qc, qubs_calibr, calibr_shots, calibr_prog)
pyquil/operator_estimation.py:1083: in _exhaustive_symmetrization
    total_prog_symm_native = qc.compiler.quil_to_native_quil(total_prog_symm)
pyquil/api/_error_reporting.py:238: in wrapper
    val = func(*args, **kwargs)
pyquil/api/_compiler.py:340: in quil_to_native_quil
    response = self.client.call('quil_to_native_quil', request, protoquil=protoquil).asdict()  # type: Dict
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <rpcq._client.Client object at 0x1235b57f0>, method_name = 'quil_to_native_quil', rpc_timeout = 10
args = (NativeQuilRequest(_type='NativeQuilRequest', quil='PRAGMA READOUT-POVM 0 "(0.95 0.18000000000000005 0.050000000000000...]\nMEASURE 0 ro[0]\n', target_device=TargetDevice(_type='TargetDevice', isa={'1Q': {'0': {}}, '2Q': {}}, specs=None)),)
kwargs = {'protoquil': None}
request = RPCRequest(_type='RPCRequest', id='bc69a5f1-57e2-47f5-96af-0eab678bebc4', jsonrpc='2.0', method='quil_to_native_quil',...nMEASURE 0 ro[0]\n', target_device=TargetDevice(_type='TargetDevice', isa={'1Q': {'0': {}}, '2Q': {}}, specs=None)),)})
start_time = 1567443910.04178, timeout = 9999.999046325684
raw_reply = b'\x85\xa7jsonrpc\xa32.0\xa5error\xbeUnhandled memory fault at #x0.\xa2id\xda\x00$bc69a5f1-57e2-47f5-96af-0eab678bebc4\xa8warnings\x90\xa5_type\xa8RPCError'
reply = RPCError(error='Unhandled memory fault at #x0.', id='bc69a5f1-57e2-47f5-96af-0eab678bebc4', jsonrpc='2.0', warnings=[])

The work-around in qvm-app was to add a WITH-LOCKED-LOG macro and use it to acquire a global lock around any logging calls.
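
For readers unfamiliar with that pattern, here is a minimal sketch of what such a macro could look like. It is not the actual qvm-app code: the variable name `*log-lock*`, the macro's argument list, and the use of bordeaux-threads are assumptions made for illustration.

```lisp
(ql:quickload :bordeaux-threads)

;; Hypothetical global lock shared by every logging call site.
(defvar *log-lock* (bt:make-lock "log lock"))

;; Sketch only: runs BODY while holding *LOG-LOCK*, so that at most one
;; thread at a time executes code wrapped in WITH-LOCKED-LOG.
(defmacro with-locked-log (&body body)
  `(bt:with-lock-held (*log-lock*)
     ,@body))

;; Example call site. The FORMAT call stands in for a real logging call.
(with-locked-log
  (format *error-output* "compiling request~%"))
```

This only helps if every logging call in the process goes through the macro; a single unwrapped call can still race with the wrapped ones.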

In the case of RPCQ, it's not so simple since the logger instance is passed in by the caller of RPCQ:START-SERVER.
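
One possible caller-side mitigation, sketched here under assumptions, is to serialize the log writer itself before handing the logger to RPCQ. The helper `locking-log-writer` is hypothetical and not part of CL-SYSLOG or RPCQ. It assumes that a log writer is an ordinary function object, and it has not been verified that serializing the writer is enough to avoid the memory fault.

```lisp
(ql:quickload :bordeaux-threads)

;; Hypothetical helper: wraps WRITER (assumed to be a plain function) so
;; that calls to it are serialized by LOCK. Arguments are passed through
;; unchanged, so the wrapper does not depend on the writer's arity.
(defun locking-log-writer (writer &optional (lock (bt:make-lock "log-writer lock")))
  (lambda (&rest args)
    (bt:with-lock-held (lock)
      (apply writer args))))
```

In the test case below, this would mean passing `(locking-log-writer (cl-syslog:tee-to-stream ...))` as the `:log-writer` argument. If the unsafe state lives outside the writer (for example in message formatting done before the writer is called), this wrapper would not cover it.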

Ideally, this would be resolved in CL-SYSLOG, if possible, but we might want to implement a workaround in QUILC/RPCQ in case that turns out to be impossible / impractical / slow to get merged.

Here is a minimal-ish testcase that reproduces the issue.

(ql:quickload :rpcq)

(defun test-method ()
  "hey")

(let* ((number-of-workers 4)
       (addr (format nil "inproc://~a" (uuid:make-v4-uuid)))
       (server-function
         (lambda ()
           (let ((dt (rpcq:make-dispatch-table)))
             (rpcq:dispatch-table-add-handler dt 'test-method)
             (rpcq:start-server :dispatch-table dt
                                :listen-addresses (list addr)
                                :logger (make-instance 'cl-syslog:rfc5424-logger
                                                       :app-name "logtest"
                                                       :facility ':local0
                                                       :maximum-priority ':debug
                                                       :log-writer
                                                       #-windows (cl-syslog:tee-to-stream
                                                                  (cl-syslog:syslog-log-writer "logtest" :local0)
                                                                  *error-output*))))))
       (server-thread (bt:make-thread server-function)))
  (sleep 1)
  (let ((threads '()))
    (unwind-protect
         (loop :repeat number-of-workers :do
           (push (bt:make-thread (lambda ()
                                   (loop :repeat 20 :do
                                     (rpcq:with-rpc-client (client addr)
                                       (rpcq:rpc-call client "test-method")))))
                 threads))
      (progn
        (dolist (thread threads)
          (bt:join-thread thread))
        (bt:destroy-thread server-thread)))))
