SantaCoder Hyperparameter Insights

#30

by nandovallec - opened May 4, 2023

Hello everyone,
I'm currently working on my master's thesis on LLMs applied to code. In my experiments, I am comparing multiple models on code summarization and code synthesis tasks in Java. Since one of the models I will be using is InCoder, I thought including SantaCoder could be interesting to further test the claim that the two are comparable.

I would like to ask whether you have any advice on using SantaCoder for these tasks, in terms of hyperparameters or prompt engineering. I checked the SantaCoder demo and noticed that you set a low temperature for the infill tasks but not for code generation. I was therefore wondering whether you have any other insights that might improve the model's performance.

Best regards.
