Cached memory and loss of data synchronization between threads.

Cache memory is a small, fast memory area located close to the CPU core. The CPU usually does not access RAM directly; instead it loads data from RAM into the cache and works on the cached copy (Figure 1). Periodically, or when certain conditions are met, the CPU synchronizes these two memory areas with each other. This process is transparent to the programmer. Have you ever thought that you would have to deal with a cache-related bug in your own project?
My program has two threads: thread A reads data from an external sensor and pushes it into a queue (Figure 2), and thread B pops data from the queue and processes it (Figure 3). The problem is that sometimes thread B receives no data, or the data it reads from the queue is garbage, even though the sensor side keeps pushing data normally.
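Since Figures 2 and 3 are not reproduced here, the setup can be sketched roughly as below. The ring-buffer queue, `read_sensor()`, and `process()` are my illustrative stand-ins, not the original code; note that the queue indices are deliberately unsynchronized, which is exactly the bug:

```c
#include <pthread.h>

#define QUEUE_SIZE 64

static int queue_buf[QUEUE_SIZE];
static int q_head, q_tail;               /* deliberately unsynchronized */

static int queue_push(int v) {
    int next = (q_head + 1) % QUEUE_SIZE;
    if (next == q_tail)
        return -1;                       /* queue full */
    queue_buf[q_head] = v;
    q_head = next;                       /* thread B may see this update late */
    return 0;
}

static int queue_pop(int *out) {
    if (q_tail == q_head)
        return -1;                       /* queue empty */
    *out = queue_buf[q_tail];
    q_tail = (q_tail + 1) % QUEUE_SIZE;
    return 0;
}

static int read_sensor(void) {           /* stand-in for the real driver call */
    static int sample;
    return sample++;
}

static void process(int v) { (void)v; }  /* stand-in for the real handler */

/* Thread A: read the sensor and push each sample into the queue. */
void *thread_a(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++)
        queue_push(read_sensor());
    return NULL;
}

/* Thread B: pop samples from the queue and process them. */
void *thread_b(void *arg) {
    (void)arg;
    int sample;
    for (int i = 0; i < 1000; i++)
        if (queue_pop(&sample) == 0)
            process(sample);
    return NULL;
}
```

Run sequentially on one core the queue behaves correctly; the trouble only appears when the two thread functions run concurrently on different cores.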
If I add a log statement in thread B to print the data it receives from the queue, the bug simply disappears. Without understanding how caching works, this bug can take a very long time to track down.
Due to the scheduling mechanism, threads A and B are distributed to different cores (cores 1 and 2) to optimize system resources (Figure 4). And because of the caching mechanism, the memory holding the queue effectively has three instances: the main one in RAM and two copies in the caches of cores 1 and 2. Each core manipulates the data in its own cached copy, and the synchronization between the cached copies and the main instance is decided by the system, not by our code.
At this point, we know that this error is caused by a loss of synchronization between the data instances in the system. If we declare the data as volatile, will the bug be fixed? For example: char volatile data[1024];
Declaring the data volatile forces the CPU to access RAM directly, so only one instance of data is left in the system. However, the bug is still not fixed. The reason is that every time a thread changes one word of the data, the CPU still has to load that word into a register first. For example, the statement data[1]-- is translated into three assembly instructions, as shown in Figure 5: load the word into a register, decrement the register, and store it back. Since the two threads run on two cores, there is a chance that both cores execute this sequence at the same time; as a result, the value of data[1] in RAM is decremented by 1 instead of by 2.
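The lost update can be replayed deterministically by interleaving the three steps of each core by hand. This is a simulation of the race, not real concurrency; the "core 1/core 2" labels and register names in the comments are illustrative:

```c
/* Each core's data[1]-- is really load, decrement, store.
 * Interleaving the two cores' steps loses one of the decrements. */
static volatile char demo_data[1024];

char lost_update_demo(void) {
    demo_data[1] = 10;

    char core1_reg = demo_data[1];  /* core 1: load  -> 10 */
    char core2_reg = demo_data[1];  /* core 2: load  -> 10 (same stale value) */
    core1_reg--;                    /* core 1: sub   ->  9 */
    core2_reg--;                    /* core 2: sub   ->  9 */
    demo_data[1] = core1_reg;       /* core 1: store ->  9 */
    demo_data[1] = core2_reg;       /* core 2: store ->  9, not 8 */

    return demo_data[1];            /* decremented once instead of twice */
}
```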
To handle this problem, we must use a mutex, as in Figure 6. The mutex makes it impossible for the two cores to run the assembly sequence from Figure 5 that changes the value of data[1] at the same time. I applied this and found that the bug was fixed.
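A sketch of the fix with POSIX threads, assuming a shared counter being decremented (the variable names are mine, not Figure 6's): every load-modify-store of the shared value happens inside the lock, so the two threads' sequences can never interleave.

```c
#include <pthread.h>

static int counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        pthread_mutex_lock(&counter_lock);
        counter--;                        /* exclusive load, decrement, store */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

int run_two_workers(void) {
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;                       /* all 20000 decrements counted */
}
```

Without the lock/unlock pair around `counter--`, the final value would usually land somewhere short of -20000 because of exactly the lost updates described above.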
However, does anyone see something unreasonable here? A mutex only prevents the two cores from changing the value of a variable in RAM at the same time. It does not, by itself, guarantee that the data will not be cached on the cores. So why does the mutex fix this bug? If the data were still cached, the bug should not be fixed at all.
The answer is the second property of a mutex: the memory barrier. The lock and unlock calls act as barriers, so the writes made inside the critical section are synchronized with RAM and made visible to the other core before the lock is released, instead of sitting unsynchronized in a per-core cache. Everyone can refer to the link I posted in the comments.
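For a single word, modern C offers both effects in one call: C11 atomics perform the whole load-modify-store indivisibly and include barrier semantics by default. This is an alternative I am adding for comparison, not part of the original post:

```c
#include <stdatomic.h>
#include <pthread.h>

/* atomic_fetch_sub does load, decrement, and store as one indivisible
 * operation, with sequentially consistent barriers by default. */
static atomic_int acounter;

static void *atomic_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++)
        atomic_fetch_sub(&acounter, 1);
    return NULL;
}

int run_atomic_workers(void) {
    pthread_t a, b;
    atomic_store(&acounter, 0);
    pthread_create(&a, NULL, atomic_worker, NULL);
    pthread_create(&b, NULL, atomic_worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&acounter);
}
```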
