10-26-23 Challenge in Rust2C

Dataset Continued

  • C test cases cannot be trusted to test Rust code
    • Rust uses Option (an enum) instead of NULL pointers, but Option in general cannot be safely used from C
    • The validity of lifetime annotations cannot be checked by C test cases
    • Of Rust's safe pointer types, only references (&) and Box can be safely passed to C
    • Some C test cases test internal implementation details, which the Rust translation cannot be expected to match
  • Thus, naively running C unit tests on translated Rust would not work
  • Possible solution
    • Handwrite test cases
      • Cost: finding enough people who know Rust to provide the code
    • Only deal with CLI tools or whole programs that use some I/O interface
      • Greatly limits the space of usable code
      • Cannot effectively measure how well the C and Rust code align with each other
    • Find ways to automatically generate equivalent Rust tests from the C tests
      • Ambiguity, though this may be solved by listing the assumptions made at each translation
      • C does not have a universally used test framework
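
To make the NULL versus Option mismatch and the "list assumptions at each translation" idea concrete, here is a minimal sketch. It assumes a hypothetical C function `const int *find(const int *arr, size_t len, int key)` that returns NULL when the key is absent, with a C test asserting `find(a, 3, 9) == NULL`. The names `find` and `find_c` are illustrative only and do not come from any real code base. The stated assumption is that a NULL return in C corresponds to `None` in Rust.

```rust
/// Idiomatic translation of the hypothetical C function
/// `const int *find(const int *arr, size_t len, int key)`.
/// Assumption: a NULL return in C corresponds to `None` here.
pub fn find(arr: &[i32], key: i32) -> Option<&i32> {
    arr.iter().find(|&&x| x == key)
}

/// C-ABI wrapper so that an existing C test could still call the Rust code.
/// It converts `None` back into a null pointer at the boundary.
/// (Exporting the symbol to C would additionally need a no_mangle attribute.)
///
/// # Safety
/// `arr` must be null or point to `len` valid, initialized `i32` values.
pub unsafe extern "C" fn find_c(arr: *const i32, len: usize, key: i32) -> *const i32 {
    if arr.is_null() {
        return std::ptr::null();
    }
    // The caller guarantees that `arr` is valid for `len` elements.
    let slice = unsafe { std::slice::from_raw_parts(arr, len) };
    find(slice, key).map_or(std::ptr::null(), |r| r as *const i32)
}

// Translated test: the C assertion `find(a, 3, 9) == NULL` becomes `is_none()`.
#[test]
fn absent_key_returns_none() {
    let a = [1, 2, 3];
    assert!(find(&a, 9).is_none());
    assert_eq!(find(&a, 2), Some(&2));
}
```

The original C test can only exercise `find_c`, which hides the Option behind a raw pointer. Only the translated Rust test checks the idiomatic `find` directly.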

Human in the loop required

  • Ambiguity that cannot be resolved from the C code alone
    • e.g. choice of pointers
      • Given some struct defined with a pointer member (written here with C++ template syntax)

        template <typename T> 
        struct Node { 
          T *next; 
        }; 
        
      • There are many ways to implement that in Rust, each with its own trade-offs.

        // using reference
        // lifetime has to be specified
        struct Node<'a, T> {
          next: &'a T,
        }
        
        // pointer with Box
        // owns the object; immutable unless the owner is declared mut
        struct Node<T> {
          next: Box<T>,
        }
        
        // pointer with Cell
        // mutable through a shared reference; T has to be Copy to read it with get()
        struct Node<T> {
          next: Cell<T>,
        }
        
        // pointer with RefCell
        // keeps track of borrows at run time
        // could panic if the borrow rules are violated when dereferencing
        struct Node<T> {
          next: RefCell<T>,
        }
        
  • Some C code has no equivalent safe Rust implementation
  • We must design a framework that lets users resolve ambiguity and approve the use of unsafe Rust
    • Idea 1: Use a PPL to achieve constrained LLM generation
    • Idea 2: At each time step, the LLM generates a plan or Rust code with holes that users can edit to resolve ambiguity