Hello! In NPKit, the thread NpKit::CpuTimestampUpdateThread() loops to update the CPU timestamp and stores each new value through the pointer cpu_timestamp_. To synchronize the CPU and GPU timelines, NPKit emits a CPU SYNC event and a GPU SYNC event at almost the same time and records the values read from cpu_timestamp_ and clock64().
However, my experiments suggest that the CPU timestamp recorded in the CPU SYNC event is not always correct. My guess is that cache coherence in the system is not strong enough to guarantee that every update from NpKit::CpuTimestampUpdateThread() reaches memory, so the CPU SYNC event may read a stale value even though the code always uses volatile. Has your team noticed this problem, and is there a way to fix it? Thanks a lot!
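To make the pattern in question concrete, here is a minimal sketch of a writer thread publishing a timestamp and a reader picking it up, using std::atomic with release/acquire ordering instead of volatile. This is not NPKit's code: CpuClock, Now(), UpdateLoop(), and Read() are hypothetical names, and std::chrono::steady_clock is an assumed time source. It only covers CPU-to-CPU visibility; if the value is read from device code, that path is outside this sketch.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the shared timestamp; not NPKit's actual code.
struct CpuClock {
  std::atomic<uint64_t> timestamp{0};  // plays the role of *cpu_timestamp_
  std::atomic<bool> stop{false};       // tells the writer loop to exit

  // Assumed time source: nanoseconds from a monotonic clock.
  static uint64_t Now() {
    return static_cast<uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now().time_since_epoch())
            .count());
  }

  // Writer: keeps publishing the current time until asked to stop.
  void UpdateLoop() {
    while (!stop.load(std::memory_order_relaxed)) {
      timestamp.store(Now(), std::memory_order_release);
    }
  }

  // Reader: what a CPU-side sync event would sample.
  uint64_t Read() const { return timestamp.load(std::memory_order_acquire); }
};
```

In C++, volatile does not by itself provide atomicity or ordering between threads, whereas std::atomic guarantees that successive reads of the same object from one thread never go backwards in its modification order. Comparing the values returned by Read() against a direct call to Now() on the reader side might help show how stale the sampled timestamp actually is.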