Suggestions for debugging double free error?
Hi,
I've implemented a branch-and-cut algorithm in C++ with Gurobi that adds several families of cuts. In some instances, whenever I add one of these families of cuts (call it Cut X), the program crashes with a “double free or corruption” error. I tried to debug it, but I cannot see the full stack trace because Gurobi's code is private (the only information the debugger gives me is that it crashes while calling “model.optimize()”).
The code seems to run fine when I disable Cut X. I also have multiple asserts in my code to check that I'm not adding a non-existent variable when adding a lazy constraint in the callback. The entire “callback()” function is in a try-catch block, and it does not seem to crash while calling my separation routines. I also tried printing the added Cut X constraints and eyeballing them, and I could not detect anything too weird. Finally, I used a debug solution and “manually” checked that it satisfies every added Cut X.
I would appreciate any suggestions on how I may try to debug this issue.
Thanks,
Matheus
……………..
I also tried using valgrind, but it points to lines inside Gurobi's library, so it is hard to get any useful info. Here is an example of the valgrind output:
==58207== Conditional jump or move depends on uninitialised value(s)
==58207== at 0x5129005: PRIVATE00000000009214ae (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x50A7848: PRIVATE000000000087facd (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x4EA57CC: PRIVATE000000000059ba62 (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x4EA1180: PRIVATE00000000005990e3 (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x4EA0E62: PRIVATE0000000000595d16 (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x515DD35: PRIVATE0000000000ba3b9d (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x515BEBE: PRIVATE000000000094f45c (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x515B33F: GRBoptimize (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x220990: GRBModel::optimize() (in /home/matheus/project/build/main)
==58207== by 0x20080B: NodeModel::solve(Solution&) (nodemodel.cpp:147)
==58207== by 0x128C74: main (main.cpp:43)
==58207== Uninitialised value was created by a heap allocation
==58207== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==58207== by 0x527221A: PRIVATE0000000000a86525 (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x51293D6: PRIVATE00000000009214ae (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x50A7848: PRIVATE000000000087facd (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x4EA57CC: PRIVATE000000000059ba62 (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x4EA1180: PRIVATE00000000005990e3 (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x4EA0E62: PRIVATE0000000000595d16 (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x515DD35: PRIVATE0000000000ba3b9d (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x515BEBE: PRIVATE000000000094f45c (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x515B33F: GRBoptimize (in /home/matheus/Programs/gurobi1200/linux64/lib/libgurobi.so.12.0.0)
==58207== by 0x220990: GRBModel::optimize() (in /home/matheus/project/build/main)
==58207== by 0x20080B: NodeModel::solve(Solution&) (nodemodel.cpp:147)
-
Hi Matheus,
Could you write the model file and the cut that results in a crash when adding it to the model, and try to reproduce this issue with a script that reads the model file and adds this single cut to the model?
If this also results in a crash, please send us the script, and we can investigate the issue on our side.
Best regards,
Marika
-
Hi Marika,
It is not a single cut that triggers the issue. The issue happens only after Gurobi has been running for a while (so multiple cuts of different types have been added). It is hard to isolate because it seems to be an undefined-behavior issue, so in many cases apparently unrelated changes make it disappear somewhat randomly. In any case, I managed to get an instance that crashes in a somewhat clean way. Here is my callback() code (omitting the code that adds the cuts).
    void MyCallback::callback() {
        // Check a data structure.
        for (int i = 0; i < MyOldCutsCMP->Size; i++) {
            if (MyOldCutsCMP->CPL[i]->CType == CMGR_CT_CAP) {
                for (int j = 1; j <= MyOldCutsCMP->CPL[i]->IntListSize; j++) {
                    int id = MyOldCutsCMP->CPL[i]->IntList[j];
                    // Ids in the cut should be between 1 and n.
                    assert(id >= 1 && id < instance.n);
                }
            }
        }

        std::cout << "where: " << where << std::endl;
        try {
            bool isMIPSol = (where == GRB_CB_MIPSOL);
            bool isMIPNode = (where == GRB_CB_MIPNODE) && (getIntInfo(GRB_CB_MIPNODE_STATUS) == GRB_OPTIMAL);
            if (!isMIPSol && !isMIPNode) {
                return;
            }
            // (*) (CODE THAT ADDS CUSTOM CUTS HERE)
        } catch (GRBException e) {
            std::cout << "Callback error code: " << e.getErrorCode() << std::endl;
            std::cout << e.getMessage() << std::endl;
            exit(1);
        } catch (...) {
            std::cout << "Unknown error!" << std::endl;
            exit(1);
        }
    }

The only place where `MyOldCutsCMP` is touched is part (*) of the code (which adds the custom cuts). When I run my code on one of my instances (the issue happens somewhat rarely), it crashes right after printing a bunch of “where: 0” lines. Since the callback returns without adding the custom cuts when `where == 0`, I suspect that something on Gurobi's side is touching the memory used by `MyOldCutsCMP`.
My current version of Gurobi is 12.0.0. Could this be the issue?
Thanks,
Matheus
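A side note on the check in the callback: `assert` compiles to nothing when `NDEBUG` is defined, so in an optimized build the loop may silently stop checking. Below is a minimal sketch of the same validation as a plain function that reports the first bad id. `StoredCut` and `findInvalidId` are hypothetical names, the vector-based layout is an assumption rather than the real cut structure, and the closed range [1, n] follows the comment in the posted code (the posted assert itself uses `id < instance.n`).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one stored cut. The real structure in the post
// uses raw arrays with 1-based indexing; this layout is an assumption.
struct StoredCut {
    bool isCapacityCut;
    std::vector<int> ids;  // 0-based positions, holding customer ids
};

// Returns an empty string if every capacity cut only references ids in
// [1, n]; otherwise returns a description of the first offending entry.
// Unlike assert, this check is not removed when NDEBUG is defined.
std::string findInvalidId(const std::vector<StoredCut>& cuts, int n) {
    for (std::size_t i = 0; i < cuts.size(); ++i) {
        if (!cuts[i].isCapacityCut) {
            continue;  // Only capacity cuts are validated, as in the post.
        }
        for (std::size_t j = 0; j < cuts[i].ids.size(); ++j) {
            const int id = cuts[i].ids[j];
            if (id < 1 || id > n) {
                std::ostringstream msg;
                msg << "cut " << i << ", position " << j << ": id " << id
                    << " outside [1, " << n << "]";
                return msg.str();
            }
        }
    }
    return std::string();
}
```

Calling it at the top of the callback and printing the returned message whenever it is non-empty would give the same signal as the assert, but in every build type.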
-
Within the POLLING callback, it is not possible to access any model or optimization data.
Could you please check the `where` argument (MIPSOL, MIPNODE) at the very beginning of the callback, before the for loop that checks your data structure? Does this also crash?
-
I had the same error again. Here is the new callback code. Just to be clear, I believe the checking loop does not use any Gurobi data. The separation routine does, but it is called after all the checking.
    void MyCallback::callback() {
        std::cout << "where: " << where << std::endl;
        bool isMIPSol = (where == GRB_CB_MIPSOL);
        bool isMIPNode = (where == GRB_CB_MIPNODE) && (getIntInfo(GRB_CB_MIPNODE_STATUS) == GRB_OPTIMAL);

        // Check a data structure.
        // (No Gurobi model data is accessed here; the cut data from previous iterations
        // of the separation routine was stored here as standard C ints and doubles.)
        std::cout << "before checking" << std::endl;
        for (int i = 0; i < MyOldCutsCMP->Size; i++) {
            if (MyOldCutsCMP->CPL[i]->CType == CMGR_CT_CAP) {
                for (int j = 1; j <= MyOldCutsCMP->CPL[i]->IntListSize; j++) {
                    int id = MyOldCutsCMP->CPL[i]->IntList[j];
                    // Ids in the cut should be between 1 and n.
                    assert(id >= 1 && id < instance.n);
                }
            }
        }
        std::cout << "after checking" << std::endl;

        if (!isMIPSol && !isMIPNode) {
            return;
        }

        try {
            // (*) (CODE THAT ACCESSES MODEL DATA AND ADDS CUSTOM CUTS HERE)
        } catch (GRBException e) {
            std::cout << "Callback error code: " << e.getErrorCode() << std::endl;
            std::cout << e.getMessage() << std::endl;
            exit(1);
        } catch (...) {
            std::cout << "Unknown error!" << std::endl;
            exit(1);
        }
    }

Here is part of the output that I get from this code.
    where: 0
    before checking
    after checking
    where: 0
    before checking
    after checking
    where: 0
    before checking
    after checking
    where: 0
    before checking
    after checking
    where: 0
    before checking
    => Assertion fails!

So it seems that even though Gurobi just called the callback with `where = POLLING` in the last iterations, it still somehow modified the memory used by `MyOldCutsCMP`.
Thanks,
Matheus
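To narrow down when the stored cuts change, one option is to fingerprint the data right after the separation code modifies it and compare the fingerprint at the start of every callback. The sketch below is only an illustration under assumptions: `CutPool`, `fingerprint`, and `PoolGuard` are made-up names, the pool is modeled as nested vectors rather than the real structure, and the hash is a plain FNV-1a.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the stored cut pool (plain ints, no Gurobi data).
struct CutPool {
    std::vector<std::vector<int> > cuts;
};

// 64-bit FNV-1a over the pool size, each list length, and every stored id,
// so a changed length is detected as well as a changed value.
std::uint64_t fingerprint(const CutPool& pool) {
    std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a offset basis
    auto mix = [&h](std::uint64_t v) {
        for (int b = 0; b < 8; ++b) {
            h ^= (v >> (8 * b)) & 0xFFu;  // feed one byte at a time
            h *= 0x100000001b3ULL;        // FNV-1a prime
        }
    };
    mix(pool.cuts.size());
    for (const std::vector<int>& cut : pool.cuts) {
        mix(cut.size());
        for (int id : cut) {
            mix(static_cast<std::uint32_t>(id));
        }
    }
    return h;
}

// Remembers the fingerprint taken when our own code last modified the pool.
class PoolGuard {
public:
    // Call right after the separation code has finished editing the pool.
    void seal(const CutPool& pool) { sealed_ = fingerprint(pool); }

    // Call at callback entry; false means the data changed since seal().
    bool intact(const CutPool& pool) const { return fingerprint(pool) == sealed_; }

private:
    std::uint64_t sealed_ = 0;
};
```

A mismatch at callback entry means the data changed after the last `seal`. It says nothing about who wrote to it, so a memory checker is still needed to find the writer.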
-
So, I understand that the crash only happens when you check your data structure during the POLLING callback. You do not see a crash if you move your return statement before the check. Is this correct?
We need a reproducible example to understand what exactly happens when the program crashes. Would you be willing to share (minimal) reproducible code?
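The reordering asked about here, with the `return` ahead of the data-structure check, might look roughly like the sketch below. It deliberately avoids the Gurobi headers so that it compiles on its own: `WhereCode`, `CallbackSketch`, and the counters are hypothetical, and `nodeIsOptimal` stands in for the `getIntInfo(GRB_CB_MIPNODE_STATUS) == GRB_OPTIMAL` test from the posted code.

```cpp
#include <cassert>

// Stand-in codes for the callback's `where` value. kPolling = 0 matches the
// "where: 0" lines in the thread; the other two values are placeholders, and
// real code should compare against GRB_CB_MIPSOL / GRB_CB_MIPNODE instead.
enum WhereCode { kPolling = 0, kMipSol = 4, kMipNode = 5 };

// Hypothetical callback skeleton with no Gurobi dependency.
struct CallbackSketch {
    int checksRun = 0;       // how often the data-structure check ran
    int separationsRun = 0;  // how often the separation routine ran

    void callback(int where, bool nodeIsOptimal) {
        const bool isMipSol = (where == kMipSol);
        const bool isMipNode = (where == kMipNode) && nodeIsOptimal;
        if (!isMipSol && !isMipNode) {
            return;  // Leave before touching any of our own data structures.
        }
        ++checksRun;       // The data-structure check would go here.
        ++separationsRun;  // The separation routine would go here.
    }
};
```

With this ordering, polling invocations never reach the check, so any crash that remains has to come from a MIPSOL or MIPNODE invocation or from outside the callback.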
-
Hi Marika,
In that particular instance, if the `return` happens before the check, the code crashes later during cut separation. But in most crashing instances, GDB pointed to the line that calls `model.optimize()`.
It's a bit tricky to isolate things into a small reproducible example because there are too many components… Still, I think I managed to fix the issue: the error hasn't shown up again, and I've run over 1000 instances without problems.
Just in case it helps someone in the future, here’s what I changed:
- In my implementation, I first create the variables as continuous, solve my own LP cutting-plane loop, change the variables to integer, and then use Gurobi as a MIP solver. In my LP cutting-plane loop, I was adding cuts as lazy constraints (by calling `addedConstraint.set(GRB_IntAttr_Lazy, 3)`), so that these cuts are treated as lazy later when solving the MIP. I've now switched to adding them as regular constraints (so I guess they are now treated as regular “original model constraints” when solving the MIP).
- I also updated Gurobi to version 12.0.2.
Thanks in any case,
Matheus