Maintain gotos when present in the original C code #186

Open
frabert opened this issue Oct 19, 2021 · 3 comments
Labels
decomp Related to LLVM IR to C decompiler enhancement New feature or request user-story

Comments

@frabert
Collaborator

frabert commented Oct 19, 2021

Currently, Rellic always avoids generating gotos, favoring ifs and whiles. The result is not always more readable than the original. For example,

extern int doA(void);
extern void undoA();

extern int doB(void);
extern void undoB();

extern int doC(void);
extern void undoC();

extern void foo(void);

int main(void) {
  if(!doA()) {
    goto exit;
  }
  if(!doB()) {
    goto cleanupA;
  }
  if(!doC()) {
    goto cleanupB;
  }

  foo();

cleanupA:
  undoA();
cleanupB:
  undoB();
cleanupC:
  undoC();
exit:
  return 0;
}

roundtrips into

unsigned int main();
unsigned int doA();
unsigned int doB();
unsigned int doC();
void foo();
void undoA();
void undoB();
void undoC();
unsigned int main() {
    unsigned int val0;
    unsigned int val1;
    unsigned int val2;
    unsigned int var3;
    var3 = 0U;
    val0 = doA();
    if (val0 != 0U) {
        val1 = doB();
        if (val1 != 0U) {
            val2 = doC();
        }
        if (val1 != 0U && val2 != 0U) {
            foo();
        }
        if (!(val1 == 0U || val2 == 0U || val0 == 0U) || !(val1 != 0U || val0 == 0U)) {
            undoA();
        }
        undoB();
        undoC();
    }
    return 0U;
}

If debug metadata regarding labels is available, Rellic should probably stick to generating jumps instead.
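
As a rough illustration of the output this request is after, the sketch below keeps the jumps from the original source. It is hand-written, not produced by Rellic. The `val0`..`val2` names follow the decompiled output above, while `main_with_gotos`, `trace`, `results`, `record` and `run_case` are hypothetical helpers added only so the call order can be checked. The stub bodies of `doA`, `undoA` and friends are assumptions too, since the issue only declares them `extern`. The `cleanupC` label, which no goto targets, is left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical test harness: every stub appends one character to `trace`
   so the order of calls can be compared against an expected string. */
static char trace[16];
static int trace_len;
static unsigned int results[3]; /* return values of doA, doB, doC */

static void record(char c) {
    if (trace_len < (int)sizeof(trace) - 1) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

/* Stand-ins for the extern functions declared in the issue. */
static unsigned int doA(void) { record('A'); return results[0]; }
static unsigned int doB(void) { record('B'); return results[1]; }
static unsigned int doC(void) { record('C'); return results[2]; }
static void undoA(void) { record('a'); }
static void undoB(void) { record('b'); }
static void undoC(void) { record('c'); }
static void foo(void) { record('f'); }

/* Sketch of a goto-preserving round trip of the original main(). */
static unsigned int main_with_gotos(void) {
    unsigned int val0;
    unsigned int val1;
    unsigned int val2;
    val0 = doA();
    if (val0 == 0U) {
        goto exit;
    }
    val1 = doB();
    if (val1 == 0U) {
        goto cleanupA;
    }
    val2 = doC();
    if (val2 == 0U) {
        goto cleanupB;
    }
    foo();
cleanupA:
    undoA();
cleanupB:
    undoB();
    undoC();
exit:
    return 0U;
}

/* Runs one scenario and reports whether the observed call order matches. */
static int run_case(unsigned int a, unsigned int b, unsigned int c,
                    const char *expected) {
    unsigned int rc;
    results[0] = a;
    results[1] = b;
    results[2] = c;
    trace_len = 0;
    trace[0] = '\0';
    rc = main_with_gotos();
    assert(rc == 0U);
    (void)rc;
    return strcmp(trace, expected) == 0;
}
```

For instance, `run_case(1U, 0U, 1U, "ABabc")` models the path where `doB()` fails and control jumps straight to `cleanupA`.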

@frabert frabert added enhancement New feature or request decomp Related to LLVM IR to C decompiler user-story labels Oct 19, 2021
@surovic
Contributor

surovic commented Oct 19, 2021

I have a feeling this might be a significant issue, since the core concept of Rellic's underlying structurization doesn't allow for unstructured control flow, i.e. goto.

@pgoodman
Contributor

Perhaps there's a way to support this, i.e. via sequencing regions or something, where a goto itself is treated as some kind of black box. This is not a priority feature but one to keep in mind, as C code that uses gotos has often made a deliberate choice to use gotos, and the label names are semantically valuable. Being able to maintain that value seems advantageous.

@surovic
Contributor

surovic commented Oct 20, 2021

> ...and the label names are semantically valuable. Being able to maintain that value seems advantageous.

Agreed.

While it might be possible to incorporate goto statements into the code itself, I think it would be much better to annotate the blocks of code near each label with the label's name, or something like that. Then, if the code doesn't look as nice, I'd try to figure out how to improve reaching condition simplification. This way we stay in line with the original algorithm and improve non-goto outputs as well.
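
A minimal sketch of what that alternative could look like for the example in this issue, assuming label names are attached as comments and the guard on `undoA()` has been simplified. This is hand-written, not Rellic output. The comment format `/* label: ... */` is an assumption, and so is the simplified guard: inside `if (val0 != 0U)`, the original condition reduces by hand to `val1 == 0U || val2 != 0U`. The helpers `annotated_main`, `trace`, `results`, `record` and `run_case`, as well as the stub bodies, are hypothetical and exist only to check the call order.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical test harness: every stub appends one character to `trace`. */
static char trace[16];
static int trace_len;
static unsigned int results[3]; /* return values of doA, doB, doC */

static void record(char c) {
    if (trace_len < (int)sizeof(trace) - 1) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

/* Stand-ins for the extern functions declared in the issue. */
static unsigned int doA(void) { record('A'); return results[0]; }
static unsigned int doB(void) { record('B'); return results[1]; }
static unsigned int doC(void) { record('C'); return results[2]; }
static void undoA(void) { record('a'); }
static void undoB(void) { record('b'); }
static void undoC(void) { record('c'); }
static void foo(void) { record('f'); }

/* Structured output with label annotations and a simplified guard. */
static unsigned int annotated_main(void) {
    unsigned int val0;
    unsigned int val1;
    unsigned int val2;
    val0 = doA();
    if (val0 != 0U) {
        val1 = doB();
        if (val1 != 0U) {
            val2 = doC();
        }
        if (val1 != 0U && val2 != 0U) {
            foo();
        }
        /* label: cleanupA */
        /* Short-circuits before reading val2 when doC() was never called. */
        if (val1 == 0U || val2 != 0U) {
            undoA();
        }
        /* label: cleanupB */
        undoB();
        undoC();
    }
    /* label: exit */
    return 0U;
}

/* Runs one scenario and reports whether the observed call order matches. */
static int run_case(unsigned int a, unsigned int b, unsigned int c,
                    const char *expected) {
    unsigned int rc;
    results[0] = a;
    results[1] = b;
    results[2] = c;
    trace_len = 0;
    trace[0] = '\0';
    rc = annotated_main();
    assert(rc == 0U);
    (void)rc;
    return strcmp(trace, expected) == 0;
}
```

Calling `run_case(1U, 1U, 0U, "ABCbc")` models the path where `doC()` fails: `undoA()` is skipped, matching the `goto cleanupB` in the original source.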
