
CS4570 Java Refactoring Benchmark

A benchmark of Java refactoring commits mined from GitHub repositories, split into high-resource and low-resource tiers based on repository popularity. Built for the TU Delft CS4570 ML for Software Engineering course (2026).

Dataset split

| Tier | Repositories | Instances | Star range |
|---|---|---|---|
| High-resource | 57 | 618 | 1000-7298 |
| Low-resource | ~60 | 198 | 13-90 |
| Total | | 816 | |

Use the resource_tier field to filter by tier. repo_stars and repo_forks are included for fine-grained popularity analysis.

Construction

Each instance is a single-file refactoring commit that was:

  1. Detected by RefactoringMiner 3.1.3 as containing at least one Java refactoring operation
  2. Verified by building the project before and after the commit using Docker (Maven/Gradle, JDK 17/21)
  3. Filtered to commits where the project compiles and tests pass both before and after (pure refactorings)
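The filter in step 3 can be re-checked on loaded rows using the four build flags listed under Fields. The following is a minimal sketch, not the pipeline's actual code: the helper name `passes_build_filter` is hypothetical, and treating a null flag as a failure is an assumption.

```python
from typing import Any, Mapping

# The four build/test flags documented in the Fields table.
BUILD_FLAGS = (
    "compile_before",
    "compile_after",
    "tests_pass_before",
    "tests_pass_after",
)


def passes_build_filter(row: Mapping[str, Any]) -> bool:
    """Return True only if every build flag is exactly True.

    Assumption: a missing or null flag counts as a failure.
    """
    return all(row.get(flag) is True for flag in BUILD_FLAGS)


# Illustrative rows, not taken from the dataset.
ok_row = {flag: True for flag in BUILD_FLAGS}
bad_row = {**ok_row, "tests_pass_after": None}
```

Because every published instance is described as having passed this filter, a check like this is mainly useful as a sanity test after loading.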

High-resource repos have >=1000 stars and >=10 contributors; low-resource repos have <100 stars.
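The tier thresholds can be written as a small rule. This is a sketch under assumptions: the function name `classify_tier` is hypothetical, and the text does not say what happens to repositories that fall between the two tiers, so the sketch returns `None` for them. It also assumes the low-resource tier has no contributor requirement, since none is stated.

```python
from typing import Optional

HIGH_MIN_STARS = 1000        # high-resource: >= 1000 stars
HIGH_MIN_CONTRIBUTORS = 10   # high-resource: >= 10 contributors
LOW_MAX_STARS = 100          # low-resource: < 100 stars (exclusive bound)


def classify_tier(stars: int, contributors: int) -> Optional[str]:
    """Map repository popularity to "high", "low", or None (neither tier)."""
    if stars >= HIGH_MIN_STARS and contributors >= HIGH_MIN_CONTRIBUTORS:
        return "high"
    if stars < LOW_MAX_STARS:
        return "low"
    # Assumption: repositories in the gap belong to neither tier.
    return None
```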

Fields

| Field | Type | Description |
|---|---|---|
| instance_id | string | Unique identifier |
| repo | string | owner/repo identifier |
| resource_tier | string | "high" or "low" |
| repo_stars | int or null | GitHub star count at collection time |
| repo_forks | int or null | GitHub fork count at collection time |
| language | string | Always "java" |
| base_commit | string | Parent commit SHA |
| merge_commit | string | Refactoring commit SHA |
| refactoring_types | list[str] | Refactoring type labels from RefactoringMiner |
| refactoring_descriptions | list[str] or null | Human-readable descriptions (HR only) |
| file_path | string | Java file path that was refactored |
| source_code_before | string | File content at base commit |
| source_code_after | string | File content at merge commit |
| diff | string | Unified diff between before and after |
| is_pure_refactoring | bool or null | True if only refactoring operations changed |
| compile_before | bool or null | Project compiled at base commit |
| compile_after | bool or null | Project compiled at merge commit |
| tests_pass_before | bool or null | All tests pass at base commit |
| tests_pass_after | bool or null | All tests pass at merge commit |
| FAIL_TO_PASS | list[str] | Tests failing before, passing after |
| PASS_TO_PASS | list[str] | Tests passing both before and after |
| created_at | string or null | Commit timestamp (ISO 8601) |
| repo_license | string or null | SPDX license identifier |
| lines_changed | int or null | Lines changed in the refactoring |
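The field table does not document the layout of instance_id. The sketch below assumes identifiers look like `owner__repo@<short merge-commit SHA>:<file path>`; both that pattern and the helper name `parse_instance_id` are assumptions, so verify them against the data before relying on them.

```python
from typing import NamedTuple


class InstanceId(NamedTuple):
    repo_slug: str   # assumed form: owner__repo
    short_sha: str   # assumed to be a prefix of merge_commit
    file_path: str


def parse_instance_id(instance_id: str) -> InstanceId:
    """Split an identifier of the assumed form slug@sha:path."""
    repo_slug, at, rest = instance_id.partition("@")
    if not at:
        raise ValueError(f"missing '@' in instance_id: {instance_id!r}")
    short_sha, colon, file_path = rest.partition(":")
    if not colon:
        raise ValueError(f"missing ':' in instance_id: {instance_id!r}")
    return InstanceId(repo_slug, short_sha, file_path)


parsed = parse_instance_id(
    "Adyen__adyen-java-api-library@c46cab5c:"
    "src/main/java/com/adyen/util/HMACValidator.java"
)
```

If the assumption holds, `parsed.file_path` should equal the row's file_path and `parsed.short_sha` should be a prefix of its merge_commit.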

Refactoring types

Top types across both tiers: Extract Method, Rename Method, Rename Variable, Rename Parameter, Extract Variable, Inline Variable, Change Variable Type, Add Parameter, Add Method Annotation, Rename Attribute.

66 unique refactoring types in total.
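A type frequency table like the one summarised above can be recomputed from the refactoring_types field. This is a minimal sketch: `count_refactoring_types` is a hypothetical helper, and the sample rows are illustrative rather than real dataset statistics.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def count_refactoring_types(rows: Iterable[Mapping[str, Any]]) -> Counter:
    """Count how often each label appears across all rows."""
    counts: Counter = Counter()
    for row in rows:
        # refactoring_types is a list of labels per instance.
        counts.update(row["refactoring_types"])
    return counts


# Illustrative rows only.
sample_rows = [
    {"refactoring_types": ["Add Parameter"]},
    {"refactoring_types": ["Rename Variable"]},
    {"refactoring_types": ["Add Parameter"]},
]
type_counts = count_refactoring_types(sample_rows)
```

Calling `type_counts.most_common(10)` on the full dataset would give a top-10 ranking comparable to the list above.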

Usage

from datasets import load_dataset

ds = load_dataset("Maxros/test", split="test")

# Filter by tier
hr = ds.filter(lambda x: x["resource_tier"] == "high")
lr = ds.filter(lambda x: x["resource_tier"] == "low")
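For comparisons between tiers, a per-tier summary of lines_changed is a natural next step. The sketch below works on any iterable of row dictionaries, so it can be tried without downloading the dataset; the helper name and the sample rows are hypothetical, and rows with a null lines_changed are skipped by assumption.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def mean_lines_changed_by_tier(
    rows: Iterable[Mapping[str, Any]],
) -> Dict[str, float]:
    """Average lines_changed per resource_tier, ignoring null values."""
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for row in rows:
        if row["lines_changed"] is None:
            continue  # the field is documented as nullable
        sums[row["resource_tier"]] += row["lines_changed"]
        counts[row["resource_tier"]] += 1
    return {tier: sums[tier] / counts[tier] for tier in sums}


# Illustrative rows only.
demo_rows = [
    {"resource_tier": "high", "lines_changed": 15},
    {"resource_tier": "high", "lines_changed": 60},
    {"resource_tier": "low", "lines_changed": 3},
    {"resource_tier": "low", "lines_changed": None},
]
tier_means = mean_lines_changed_by_tier(demo_rows)
```

Since iterating over a loaded dataset yields one dictionary per row, the same function should accept the object returned by `load_dataset` directly.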

Citation

TU Delft CS4570 -- ML for Software Engineering benchmark, 2026.