Request for the prompts used for the evaluation results

by sh0416 - opened Mar 4, 2024

edited Mar 4, 2024

I tried to reproduce the MBPP+ result and got a score of 0.4385, which is lower than the reported score (0.474).

I suspect the difference comes from the prompt I used in my experiment.

My prompt is as follows:

````
Problem: Write a function to find the shared elements from the given two lists.
Test:
assert set(similar_elements((3, 4, 5, 6),(5, 7, 4, 10))) == set((4, 5))
assert set(similar_elements((1, 2, 3, 4),(5, 4, 3, 7))) == set((3, 4))
assert set(similar_elements((11, 12, 14, 13),(17, 15, 14, 13))) == set((13, 14))
Implementation:
```python
````
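
For anyone comparing formats, the prompt above could be assembled along these lines. This is only a sketch: `build_prompt` is a hypothetical helper (not taken from any evaluation harness), and the trailing newline after the open fence is an assumption.

```python
# Hypothetical helper: reproduces the Problem/Test/Implementation layout
# of the prompt quoted in this post. Not from any evaluation harness.
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def build_prompt(problem: str, tests: list[str]) -> str:
    """Return the prompt text for one MBPP-style task."""
    lines = [f"Problem: {problem}", "Test:"]
    lines.extend(tests)  # one assert statement per line
    lines.append("Implementation:")
    # End with an open Python fence so the model continues with code.
    lines.append(FENCE + "python")
    # Assumption: the prompt ends with a newline after the open fence.
    return "\n".join(lines) + "\n"


prompt = build_prompt(
    "Write a function to find the shared elements from the given two lists.",
    [
        "assert set(similar_elements((3, 4, 5, 6),(5, 7, 4, 10))) == set((4, 5))",
        "assert set(similar_elements((1, 2, 3, 4),(5, 4, 3, 7))) == set((3, 4))",
        "assert set(similar_elements((11, 12, 14, 13),(17, 15, 14, 13))) == set((13, 14))",
    ],
)
print(prompt)
```

Small differences here (whether the tests are included, the section labels, the trailing fence) are the kind of detail that might explain a gap in scores, which is why I am asking for the exact format.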

Which prompt was used for the reported evaluation?
Thank you.
