Thank you for your excellent work! I have some questions regarding the dataset.
Is there a detailed introduction section for the dataset? What does each key in a sample dict represent? Specifically, could you clarify what the target code and the fastest code in the test data represent? For example, is the target code the best solution submitted by the user who also provided the source code, or is it the best solution submitted by any user for this problem? Similarly, how is the fastest code defined?
Additionally, do I need to rerun the evaluation for the whole test set myself, including the source code and the target code? I'm concerned that the CPU times I measure might differ from your reported results, even with the provided Docker environment, because my evaluation would run on different hardware.