Solve speed difference between Linux, Windows, and WSL2

Ongoing

Comments

3 comments

  • Jaromił Najman
    Gurobi Staff

    Hi Timo,

    This is an interesting observation.

    Let me try to explain what might be happening here.

    You can see that on WSL2 and Linux, the path taken by Gurobi is the same (or at least seems to be). This can be seen from the identical number of explored nodes, simplex iterations, and work units. Native Linux is faster here, which is expected because WSL2 is "just" a Windows subsystem and may not perform as well as native Linux.
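The "same path" check described above can be done mechanically on the log files. The following is a minimal sketch, assuming the end-of-solve summary line has the form printed by recent Gurobi versions ("Explored N nodes (M simplex iterations) in S seconds (W work units)"). The seconds and work units are the bnatt400 values quoted in this thread; the node and iteration counts are placeholders, since the thread does not give them.

```python
import re

# Pattern for the end-of-solve summary line (format assumed, see lead-in).
SUMMARY = re.compile(
    r"Explored (\d+) nodes \((\d+) simplex iterations\) "
    r"in ([\d.]+) seconds \(([\d.]+) work units\)"
)

def parse_summary(log_text):
    """Extract nodes, iterations, seconds and work units from a log."""
    m = SUMMARY.search(log_text)
    if m is None:
        raise ValueError("no 'Explored ... nodes' summary line found")
    return {
        "nodes": int(m.group(1)),
        "iterations": int(m.group(2)),
        "seconds": float(m.group(3)),
        "work": float(m.group(4)),
    }

def same_path(a, b):
    """Two runs likely took the same path if all deterministic counters match."""
    return all(a[k] == b[k] for k in ("nodes", "iterations", "work"))

# Node and iteration counts below are PLACEHOLDERS, not values from the thread.
wsl2 = parse_summary(
    "Explored 1000 nodes (50000 simplex iterations) in 270.33 seconds (358.05 work units)"
)
linux = parse_summary(
    "Explored 1000 nodes (50000 simplex iterations) in 187.45 seconds (358.05 work units)"
)
windows = parse_summary(
    "Explored 2000 nodes (90000 simplex iterations) in 427.84 seconds (725.70 work units)"
)

print("WSL2 vs Linux same path:  ", same_path(wsl2, linux))
print("Windows vs Linux same path:", same_path(windows, linux))
```

Wall-clock seconds are deliberately left out of the comparison, because they depend on the machine and OS even when the path is identical.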

    Now, let's compare Windows to Linux in your tests. For case bnatt400, the ratio for Windows is 725.7/427.84 ≈ 1.7 work units per second. For WSL2 we have 358.05/270.33 ≈ 1.3, and for Linux 358.05/187.45 ≈ 1.9. So although Windows took more seconds to solve the model, its solution speed, i.e., work units per second, seems comparable to Linux. The difference in solve time can be explained by the fact that a different solution path was taken on Windows. A different path can result from running on a different operating system, because each OS uses the hardware slightly differently.
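For reference, the bnatt400 ratios can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the work-unit and runtime values are the ones quoted above, and the variable names are illustrative.

```python
# (work units, seconds) for bnatt400, as quoted in this thread.
runs = {
    "Windows": (725.7, 427.84),
    "WSL2": (358.05, 270.33),
    "Linux": (358.05, 187.45),
}

# Work units per second: a throughput measure that does not depend
# on how long the chosen solution path happens to be.
rates = {name: work / seconds for name, (work, seconds) in runs.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} work units/s")
```

Dividing by the runtime separates "how fast does the machine process work" from "how much work did this particular path need", which is why Windows can look comparable here despite the longer solve time.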

    For case cost266-UUE, we have 0.6 work units per second on Windows, 1.1 for WSL2, and 1.5 for Linux. This definitely sounds surprising, but may still happen for single instances (outliers). The different path chosen on Windows for the cost266-UUE case may cause this drop in work units per second. However, without a larger benchmark, it is very hard to tell.

    You say that you tested all 3 operating systems on the same machine. Are you sure that Windows was able to use all its resources and that no background processes were running? This may be quite difficult to achieve on Windows without particular permissions.

    I hope this helps.

    Best regards, 
    Jaromił

  • Timo Vermeer

    > Are you sure that Windows was able to use all its resources and that no background processes were running? This may be quite difficult to achieve on Windows without particular permissions.

    I mean, WSL2 runs alongside the same background processes, so surely that can't explain the difference between 0.6 WU/sec and 1.1 WU/sec. There are certainly more background processes running on Windows, but nothing particularly resource-intensive. And with 8 cores, and MIP solving not scaling perfectly, some of those cores are typically unused (i.e. free for the background processes to use).

    > For case cost266-UUE, we have 0.6 work units per second on Windows, 1.1 for WSL2, and 1.5 for Linux. This definitely sounds surprising, but may still happen for single instances (outliers). The different path chosen on Windows for the cost266-UUE case may cause this drop in work units per second. However, without a larger benchmark, it is very hard to tell.

    If these are indeed outliers, I would expect Windows to sometimes be _faster_ than Linux as well. I'll see if I can look at a bigger data set, i.e. more MIPLIB 2017 cases (skipping the ones that time out in https://plato.asu.edu/ftp/milp_tables/8threads.res ).

  • Timo Vermeer

    Back with more testing. I got a new PC with a Ryzen 7700X and did a fresh install of Windows 11 on it. I then ran the Linux tests using the Docker image on an Ubuntu 23.04 live USB. Lastly, I installed WSL2 in Windows to run the Docker-in-WSL2 tests there.

    As mentioned above, I ran 227 cases from the MIPLIB 2017 benchmark set, skipping the ones with timeouts to save a bit of time. I ran every case with 4 different seeds (default, 10, 100, and 1000).

    Summary:

    Time:

    - Linux is about 25% faster than Windows on average; median is about 30% faster.
    - WSL2 is about 5% slower than Windows on average; median is about 2% slower.

    WU/sec:

    - Linux is about 70% faster than Windows on average; median is about 40% faster.
    - WSL2 is about 20% faster than Windows on average; median is about 1-2% slower.
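The WSL2 numbers above (faster on average, slightly slower in the median) are not contradictory: a few instances with a large speedup pull the mean up while the median stays put. The sketch below illustrates this with made-up per-instance ratios. It assumes the averages are arithmetic means of per-instance WU/sec ratios, which the post does not state, and the data is hypothetical, not taken from the log files.

```python
from statistics import mean, median

# HYPOTHETICAL per-instance WU/sec ratios (WSL2 divided by Windows).
# Most instances are marginally slower; two outliers are much faster.
ratios = [0.97, 0.98, 0.98, 0.99, 0.99, 1.00, 1.01, 1.60, 2.30]

avg = mean(ratios)    # sensitive to the two large outliers
med = median(ratios)  # the "typical" instance

print(f"mean ratio:   {avg:.2f}")
print(f"median ratio: {med:.2f}")
```

With skewed data like this, the median is probably the better indicator of what to expect on a typical instance. A geometric mean of the ratios is another common choice in solver benchmarks, because it treats a 2x speedup and a 2x slowdown symmetrically.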

    Pictures:

    I have all the log files, but pictures are easier to understand (and I cannot upload zip files). The vertical axis is performance; the horizontal axis is the average run time of the particular problem on Windows. Note that the latter uses a log scale.

    And just to show that the Docker runs are deterministic:

    Some interesting unrelated findings:

    - Case gmu-35-50 is _a lot_ slower on Linux/Docker compared to Windows. The minimum was 4.5 times the work units; the maximum was 50 times.
    - Conversely, gfd-schedulen180f7d50m30k18 is a lot faster on Linux. Minimum about the same (1.03 times the work units) because then Windows is _also_ slow, but 3 runs on Linux are 600 times slower than on Windows.

    Questions I have:

    This is going very much into detail on software, compilation, scheduling, etc, but:

    - I would expect Gurobi to spend most of its time in hand-written assembly routines (or at the very least intrinsics). I can try to figure that out myself, but is this not the case?
    - If this _is_ the case, where else could the performance difference come from? A 25% difference in run time on average is a lot; surely the scheduler on Windows can't be that bad?

    I don't really expect answers on these, but hopefully there's someone at Gurobi that likes performance/profiling/assembly/schedulers as much as I do :)
