Do some science about perf impacts of local vs referenced variables #25

@ounsworth

Here are two implementations of the same function: one does the computation on a local copy of the data, and the other operates in place on the provided reference.

While playing with benchmarks for several of the library's algorithms, I started getting a hunch that the in_place versions of functions like these are faster. Maybe that's because there's a cache-miss penalty for making a lot of accesses to memory somewhere else, or maybe because, when you make a local copy, the compiler knows that it doesn't need to be thread-safe, so it is free to do all the computation in CPU cache and write only the final result to RAM instead of waiting on RAM I/O for every operation. Or something. But I wasn't able to prove it to myself conclusively. So this task is to do some science (rigorous benchmarking or analysis of the compiled assembly) to get a conclusive answer.

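// Copies the input into a local array, mutates the copy, and returns it.
fn do_something(arr: &[u8; 1000]) -> [u8; 1000] {
    let mut local_arr = *arr; // [u8; 1000] is Copy, so this copies all 1000 bytes

    // some random expensive operations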
    for i in 1 .. 999 {
        for j in i .. 999 {
            local_arr[j] = local_arr[j-1] ^ local_arr[j] ^ local_arr[j+1];
        }
    }
    
    local_arr
}

// Mutates the caller's array directly through the exclusive reference.
fn do_something_in_place(arr: &mut [u8; 1000]) {
    // some random expensive operations
    for i in 1 .. 999 {
        for j in i .. 999 {
            arr[j] = arr[j-1] ^ arr[j] ^ arr[j+1];
        }
    }
}
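
As a possible starting point for the benchmarking side of this task, here is a minimal timing sketch that uses only the standard library. `make_input` and `time_it` are hypothetical helpers invented for this example, the iteration count is arbitrary, and the two function bodies are the same as above (with `*arr` in place of `arr.clone()`). It is a rough first look, not the rigorous measurement the issue asks for.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

const N: usize = 1000;

// Same computation as the two functions above. #[inline(never)] keeps each
// one as a separate symbol, which makes it easier to find in emitted assembly.
#[inline(never)]
fn do_something(arr: &[u8; N]) -> [u8; N] {
    let mut local_arr = *arr;
    for i in 1..N - 1 {
        for j in i..N - 1 {
            local_arr[j] = local_arr[j - 1] ^ local_arr[j] ^ local_arr[j + 1];
        }
    }
    local_arr
}

#[inline(never)]
fn do_something_in_place(arr: &mut [u8; N]) {
    for i in 1..N - 1 {
        for j in i..N - 1 {
            arr[j] = arr[j - 1] ^ arr[j] ^ arr[j + 1];
        }
    }
}

// Hypothetical helper: deterministic, non-constant input data.
fn make_input() -> [u8; N] {
    let mut arr = [0u8; N];
    for (i, b) in arr.iter_mut().enumerate() {
        *b = (i * 31 + 7) as u8; // truncating cast is intentional
    }
    arr
}

// Hypothetical helper: total wall-clock time for `iters` calls of `f`.
fn time_it<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed()
}

fn main() {
    let input = make_input();

    // Sanity check: both variants must produce the same bytes.
    let mut in_place = input;
    do_something_in_place(&mut in_place);
    assert_eq!(do_something(&input), in_place);

    let iters = 50; // arbitrary; raise it for steadier numbers
    // black_box discourages the optimizer from deleting or hoisting the calls.
    let copy_time = time_it(iters, || {
        black_box(do_something(black_box(&input)));
    });
    // The in-place variant keeps transforming the same buffer on every call.
    let mut buf = input;
    let in_place_time = time_it(iters, || {
        do_something_in_place(black_box(&mut buf));
    });
    black_box(&buf);

    println!("local copy: {:?} per call", copy_time / iters);
    println!("in place:   {:?} per call", in_place_time / iters);
}
```

Numbers from a sketch like this only mean something in a `--release` build, and a statistics-aware harness would be a better basis for a conclusive answer. For the assembly side, `cargo rustc --release -- --emit asm` should write out the generated code so the two function bodies can be compared directly.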

Labels: help wanted, research
