2020-04-21优化4 分钟读完 (大约 549 个字)

Cache friendly code

什么是缓存友好代码？学院派的回答大概会涉及CPU多级缓存结构等原理性的内容，但也许只要理解“缓存是CPU里一块比内存小很多，但也快很多的存储空间，访问内存时CPU会一段一段的把内存读取到缓存，如果后续操作的数据在缓存里就不会读内存，整体速度就快了”，这样的基本概念其实就可以开始编写缓存友好的代码了。

多核多线程时还会有False Sharing等问题，可以入门后再逐步了解。

下面是一段演示代码。对一个二维数组的数据求和，区别只是按“行”访问还是按“列”访问(会影响cache是否命中)，品出其中差别也就算入门了。

在数据集小到可以全部放在cache中的时候，是真的飞起了，当然现实中很少有这样理想的情况，缓存友好的核心也就是如何设计能高命中的数据结构和代码。

常见的做法会把对象属性以线性存储，并分独立函数更新。这里会有点和面向对象拧着的感觉，需要适应一下，对外接口OOP内部组织DOP。

Unity的DOTS中的ECS(Entity Component System )是不错的概念参考。

#include <iostream>
#include <chrono>
#include <string>
#include <vector>

using namespace std;

template <int M, int N>
int sumarrayrows(int a[M][N])
{
    int sum = 0;
    for (int i = 0; i != M; ++i)
        for (int j = 0; j != N; ++j)
            sum += a[i][j];
    return sum;
}

template <int M, int N>
int sumarraycols(int a[M][N])
{
    int sum = 0;
    for (int j = 0; j != N; ++j)
        for (int i = 0; i != M; ++i)
            sum += a[i][j];
    return sum;
}

template <int M, int N>
void testCase()
{
    cout << "testCase: " << M << "," << N << endl;
    int data[M][N] = {1};
    {
        auto start = chrono::steady_clock::now();
        int sum = 0;
        for (int loop = 0; loop < 10000; loop++)
        {
            sum += sumarrayrows<M, N>(data);
        }
        auto end = chrono::steady_clock::now();
        cout << "\tsum: " << sum << " time: " << chrono::duration_cast<chrono::microseconds>(end - start).count() << endl;
    }

    {
        auto start = chrono::steady_clock::now();
        int sum = 0;
        for (int loop = 0; loop < 10000; loop++)
        {
            sum += sumarraycols<M, N>(data);
        }
        auto end = chrono::steady_clock::now();
        cout << "\tsum: " << sum << " time: " << chrono::duration_cast<chrono::microseconds>(end - start).count() << endl;
    }
}

int main()
{
    testCase<10, 10>();
    testCase<100, 100>();
}

1	clang ./cacheTest.cpp --std=c++14 -lstdc++ -O2 -o test.out

testCase: 10,10
        sum: 10000 time: 5
        sum: 10000 time: 306
testCase: 100,100
        sum: 10000 time: 15380
        sum: 10000 time: 74331

Cache friendly code

分类

最新文章

归档

标签