question_type (string, 5 classes) | image_path (list, length 1 to 32) | question (string, length 30 to 153) | answer (string, 72 classes) | choices (string, length 16 to 57) |
|---|---|---|---|---|
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/280.jpg"
] | Could you tell me the location of the counter in comparison to the refrigerator? | A. right | A. right
B. front-up
C. back-left
D. front |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/760.jpg"
] | How is the cabinet positioned with respect to the table? | D. back | A. left
B. front-left
C. above-left
D. back |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/2020.jpg"
] | If you're looking at the counter, where would you find the table? | D. back-right | A. front-right
B. front
C. front-up
D. back-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/2120.jpg"
] | Could you tell me the location of the table in comparison to the television? | B. down | A. above
B. down
C. back-left
D. back-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/20.jpg"
] | How is the table positioned with respect to the refrigerator? | C. back-down | A. above
B. front-up
C. back-down
D. back-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/840.jpg"
] | If you're looking at the refrigerator, where would you find the television? | B. right | A. front-down
B. right
C. back-down
D. above-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/2300.jpg"
] | How is the counter positioned with respect to the sink? | C. back-right | A. front-left
B. back-left
C. back-right
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/2200.jpg"
] | Could you tell me the location of the window in comparison to the counter? | B. right | A. front-down
B. right
C. back-left
D. left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/1960.jpg"
] | Could you tell me the location of the television in comparison to the refrigerator? | D. right | A. back
B. front-up
C. back-left
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/1400.jpg"
] | Where is the refrigerator in relation to the cabinet? | C. down | A. back
B. left
C. down
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/1000.jpg"
] | If you're looking at the counter, where would you find the refrigerator? | A. front-right | A. front-right
B. above-left
C. left
D. above |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/1300.jpg"
] | Where is the refrigerator located compared to the counter from the camera's perspective? | D. front-up | A. down-right
B. above-right
C. front-left
D. front-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/1200.jpg"
] | Could you tell me the location of the television in comparison to the refrigerator? | A. right | A. right
B. down-left
C. back-down
D. front-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/1640.jpg"
] | Can you describe the position of the refrigerator relative to the sink? | B. front-right | A. back-down
B. front-right
C. front-down
D. front-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/60.jpg"
] | How is the refrigerator positioned with respect to the cabinet? | A. front-right | A. front-right
B. back-down
C. down
D. back-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/440.jpg"
] | Where is the chair in relation to the television? | D. back-down | A. back-up
B. front-down
C. front
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/1120.jpg"
] | Where is the window in relation to the sink? | D. above | A. front-down
B. front
C. back-right
D. above |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/880.jpg"
] | Where is the chair located compared to the counter from the camera's perspective? | A. right | A. right
B. back-down
C. above-left
D. left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/2300.jpg"
] | Where is the window in relation to the sink? | D. above | A. back-up
B. back-right
C. front
D. above |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/80.jpg"
] | Where is the refrigerator in relation to the television? | A. left | A. left
B. down-right
C. down
D. front-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/1400.jpg"
] | Where is the window located compared to the sink from the camera's perspective? | A. above | A. above
B. front
C. down-left
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/780.jpg"
] | Can you describe the position of the table relative to the door? | B. back-down | A. above-right
B. back-down
C. left
D. above-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/1840.jpg"
] | How is the refrigerator positioned with respect to the counter? | A. front-right | A. front-right
B. back-left
C. back-right
D. above-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/1320.jpg"
] | Could you tell me the location of the counter in comparison to the window? | B. right | A. down-left
B. right
C. back-left
D. above-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/1680.jpg"
] | Where is the refrigerator in relation to the sink? | C. front-right | A. above
B. down-left
C. front-right
D. back |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0011_01/original_images/1200.jpg"
] | If you're looking at the refrigerator, where would you find the table? | A. back-down | A. back-down
B. back-left
C. front-right
D. front-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0019_01/original_images/200.jpg"
] | Where is the door located compared to the lamp from the camera's perspective? | B. right | A. down-left
B. right
C. down
D. front-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0019_01/original_images/180.jpg"
] | Where is the door located compared to the lamp from the camera's perspective? | C. right | A. above
B. back-up
C. right
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0019_01/original_images/480.jpg"
] | Could you tell me the location of the sofa in comparison to the pillow? | B. right | A. front
B. right
C. back-up
D. front-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/0.jpg"
] | How is the blinds positioned with respect to the shelves? | C. above-left | A. right
B. back-down
C. above-left
D. front-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1220.jpg"
] | Could you tell me the location of the desk in comparison to the chair? | A. right | A. right
B. front
C. back-down
D. above-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1160.jpg"
] | Could you tell me the location of the chair in comparison to the books? | D. back-down | A. down-left
B. front-down
C. front-up
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1060.jpg"
] | Where is the chair located compared to the books from the camera's perspective? | D. back-down | A. above
B. right
C. front-down
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1120.jpg"
] | How is the window positioned with respect to the books? | B. above-right | A. front-up
B. above-right
C. back-left
D. front |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/980.jpg"
] | Where is the chair located compared to the shelves from the camera's perspective? | D. back | A. left
B. down
C. front
D. back |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1240.jpg"
] | How is the cabinet positioned with respect to the books? | D. front-down | A. back-down
B. front-right
C. front-left
D. front-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/820.jpg"
] | Where is the sofa in relation to the cabinet? | A. right | A. right
B. front-left
C. front-up
D. front-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/900.jpg"
] | Where is the chair in relation to the cabinet? | C. left | A. above-right
B. above
C. left
D. down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/920.jpg"
] | Can you describe the position of the chair relative to the blinds? | A. down-right | A. down-right
B. back-left
C. left
D. back-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1760.jpg"
] | If you're looking at the shelves, where would you find the chair? | D. back-down | A. above
B. back-left
C. right
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/500.jpg"
] | Could you tell me the location of the sofa in comparison to the desk? | D. front-left | A. down-right
B. above
C. down
D. front-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1020.jpg"
] | Could you tell me the location of the blinds in comparison to the books? | C. right | A. back-up
B. front-up
C. right
D. front-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1420.jpg"
] | If you're looking at the shelves, where would you find the chair? | A. right | A. right
B. front-up
C. front-left
D. down-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1820.jpg"
] | How is the cabinet positioned with respect to the books? | B. down-right | A. above
B. down-right
C. above-left
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/380.jpg"
] | Where is the sofa located compared to the pillow from the camera's perspective? | D. down-right | A. front-right
B. above
C. front-down
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/940.jpg"
] | If you're looking at the shelves, where would you find the chair? | D. back-right | A. front-right
B. down-right
C. above
D. back-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/180.jpg"
] | Could you tell me the location of the sofa in comparison to the desk? | A. left | A. left
B. down
C. back-right
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1720.jpg"
] | How is the chair positioned with respect to the window? | D. down-right | A. front-down
B. above-right
C. above
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_00/original_images/1200.jpg"
] | Where is the chair located compared to the cabinet from the camera's perspective? | C. back | A. front-right
B. right
C. back
D. front-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/720.jpg"
] | How is the sofa positioned with respect to the books? | B. left | A. right
B. left
C. front
D. above |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1640.jpg"
] | Could you tell me the location of the cabinet in comparison to the chair? | D. front | A. left
B. right
C. back-right
D. front |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/240.jpg"
] | Where is the cabinet located compared to the table from the camera's perspective? | C. left | A. back-right
B. front-down
C. left
D. down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1600.jpg"
] | Can you describe the position of the chair relative to the box? | B. left | A. back
B. left
C. front
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/660.jpg"
] | Where is the books in relation to the window? | D. down-left | A. front-down
B. back-left
C. back-up
D. down-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1220.jpg"
] | Could you tell me the location of the sofa in comparison to the pillow? | C. right | A. front-down
B. front-left
C. right
D. down-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/300.jpg"
] | Where is the table located compared to the chair from the camera's perspective? | B. left | A. front-down
B. left
C. back-down
D. front-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/260.jpg"
] | How is the blinds positioned with respect to the box? | C. front-up | A. down
B. above-right
C. front-up
D. front-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1660.jpg"
] | Where is the cabinet in relation to the box? | C. left | A. front-right
B. front-up
C. left
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/880.jpg"
] | How is the table positioned with respect to the chair? | D. left | A. down
B. front-down
C. back-down
D. left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/40.jpg"
] | Can you describe the position of the sofa relative to the pillow? | C. right | A. front
B. back
C. right
D. front-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/920.jpg"
] | If you're looking at the box, where would you find the cabinet? | C. front-left | A. down
B. down-left
C. front-left
D. back-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1760.jpg"
] | Could you tell me the location of the box in comparison to the shelves? | A. back | A. back
B. above-left
C. front-left
D. above |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/860.jpg"
] | Where is the chair in relation to the shelves? | C. back-down | A. back-up
B. back-left
C. back-down
D. above |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1520.jpg"
] | Could you tell me the location of the sofa in comparison to the pillow? | C. right | A. front-up
B. down
C. right
D. down-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/460.jpg"
] | If you're looking at the books, where would you find the sofa? | B. left | A. front-up
B. left
C. above-right
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1140.jpg"
] | How is the cabinet positioned with respect to the chair? | B. right | A. down
B. right
C. front-down
D. front-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1400.jpg"
] | Where is the shelves located compared to the window from the camera's perspective? | C. down-left | A. back-right
B. front-left
C. down-left
D. front-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/800.jpg"
] | Where is the table located compared to the cabinet from the camera's perspective? | C. left | A. front-right
B. front
C. left
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/340.jpg"
] | If you're looking at the cabinet, where would you find the sofa? | D. right | A. above
B. left
C. back-left
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1680.jpg"
] | Can you describe the position of the box relative to the shelves? | C. back-left | A. front-left
B. back-down
C. back-left
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1300.jpg"
] | How is the books positioned with respect to the pillow? | A. left | A. left
B. down
C. back
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_01/original_images/1200.jpg"
] | If you're looking at the pillow, where would you find the sofa? | A. right | A. right
B. front-up
C. down-left
D. back-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/1040.jpg"
] | Can you describe the position of the box relative to the cabinet? | B. right | A. front-down
B. right
C. back-up
D. front-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/320.jpg"
] | Where is the chair in relation to the blinds? | D. back-down | A. front-down
B. down-left
C. left
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/440.jpg"
] | Can you describe the position of the cabinet relative to the books? | D. down | A. back-up
B. above-right
C. back-left
D. down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/220.jpg"
] | Where is the cabinet in relation to the chair? | B. back-right | A. above-right
B. back-right
C. above
D. front-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/1060.jpg"
] | Where is the box located compared to the cabinet from the camera's perspective? | A. right | A. right
B. front-down
C. down
D. down-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/580.jpg"
] | Where is the whiteboard in relation to the sofa? | B. above-right | A. down-left
B. above-right
C. front-left
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/640.jpg"
] | Could you tell me the location of the door in comparison to the whiteboard? | C. left | A. back-down
B. back-up
C. left
D. above-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/840.jpg"
] | Where is the sofa in relation to the pillow? | D. right | A. back-down
B. front
C. down-left
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/400.jpg"
] | Where is the box located compared to the cabinet from the camera's perspective? | D. back-right | A. above-left
B. down-left
C. above
D. back-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/1340.jpg"
] | Can you describe the position of the sofa relative to the shelves? | C. down-left | A. front-right
B. above
C. down-left
D. front |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/620.jpg"
] | Can you describe the position of the door relative to the whiteboard? | A. left | A. left
B. above-right
C. front
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/140.jpg"
] | Could you tell me the location of the sofa in comparison to the cabinet? | D. right | A. back-up
B. front-left
C. back-down
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/1460.jpg"
] | Could you tell me the location of the door in comparison to the shelves? | A. right | A. right
B. back-left
C. back-up
D. down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/200.jpg"
] | Can you describe the position of the cabinet relative to the chair? | A. right | A. right
B. front-left
C. back-up
D. front-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/340.jpg"
] | Can you describe the position of the chair relative to the box? | A. right | A. right
B. down
C. back-up
D. front-left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/1000.jpg"
] | Could you tell me the location of the box in comparison to the cabinet? | C. back-right | A. above-right
B. front-left
C. back-right
D. front-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/940.jpg"
] | If you're looking at the chair, where would you find the books? | A. front-up | A. front-up
B. back
C. back-down
D. back-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/560.jpg"
] | How is the whiteboard positioned with respect to the sofa? | B. above-right | A. front-left
B. above-right
C. down
D. front |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0025_02/original_images/540.jpg"
] | Where is the whiteboard located compared to the sofa from the camera's perspective? | A. above-right | A. above-right
B. back-down
C. back-left
D. down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_00/original_images/2360.jpg"
] | Can you describe the position of the paper relative to the bag? | B. above-left | A. back-left
B. above-left
C. front-left
D. down-right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_00/original_images/1960.jpg"
] | Could you tell me the location of the bookshelf in comparison to the paper? | B. left | A. down-right
B. left
C. back-down
D. front-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_00/original_images/1800.jpg"
] | Could you tell me the location of the paper in comparison to the window? | C. left | A. above-right
B. front-up
C. left
D. back-down |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_00/original_images/340.jpg"
] | Where is the paper located compared to the bag from the camera's perspective? | A. above-left | A. above-left
B. back-right
C. back-up
D. front-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_00/original_images/2000.jpg"
] | Could you tell me the location of the bookshelf in comparison to the paper? | A. front-left | A. front-left
B. above-right
C. back-right
D. right |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_00/original_images/1320.jpg"
] | Can you describe the position of the chair relative to the bookshelf? | C. right | A. above-left
B. back-down
C. right
D. back-up |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_01/original_images/660.jpg"
] | Where is the bag located compared to the box from the camera's perspective? | C. right | A. down-left
B. front
C. right
D. left |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_01/original_images/1340.jpg"
] | Where is the bookshelf located compared to the chair from the camera's perspective? | A. front-left | A. front-left
B. right
C. above-right
D. above |
Camera perspective - Relative Direction | [
"ViewSpatial-Bench/scannetv2_val/scene0030_02/original_images/300.jpg"
] | Could you tell me the location of the box in comparison to the bag? | A. front | A. front
B. above-right
C. left
D. back-up |
ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models
Dataset Description
We introduce ViewSpatial-Bench, a comprehensive benchmark with over 5,700 question-answer pairs across 1,000+ 3D scenes from ScanNet and MS-COCO validation sets. This benchmark evaluates VLMs' spatial localization capabilities from multiple perspectives, specifically testing both egocentric (camera) and allocentric (human subject) viewpoints across five distinct task types.
ViewSpatial-Bench addresses a critical gap: while VLMs excel at spatial reasoning from their own perspective, they struggle with perspective-taking—adopting another entity's spatial frame of reference—which is essential for embodied interaction and multi-agent collaboration. The figure below shows the construction pipeline and example demonstrations of our benchmark.
The dataset contains the following fields:
| Field Name | Description |
|---|---|
| question_type | Type of spatial reasoning task; one of 5 distinct categories that evaluate different spatial capabilities |
| image_path | Path to the source image; data comes from two sources: scannetv2_val (ScanNet validation set) and val2017 (MS-COCO validation set) |
| question | The spatial reasoning question posed to the model |
| answer | The correct answer to the question |
| choices | Multiple-choice options available for the question |
- Language(s) (NLP): en
- License: apache-2.0
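As a rough illustration of how these fields fit together, the sketch below parses one record copied from the dataset preview. The helper names (`parse_choices`, `answer_letter`, `image_source`) are hypothetical and not part of the dataset or of any library. The sketch assumes that `choices` is a newline-separated string of `X. text` options and that `answer` starts with the option letter, as the preview rows suggest.

```python
import re

# One record copied from the dataset preview (values as shown there).
record = {
    "question_type": "Camera perspective - Relative Direction",
    "image_path": ["ViewSpatial-Bench/scannetv2_val/scene0011_00/original_images/280.jpg"],
    "question": "Could you tell me the location of the counter in comparison to the refrigerator?",
    "answer": "A. right",
    "choices": "A. right\nB. front-up\nC. back-left\nD. front",
}

def parse_choices(choices: str) -> dict[str, str]:
    """Map option letters to option text, assuming one 'X. text' option per line."""
    options = {}
    for line in choices.splitlines():
        match = re.match(r"\s*([A-Z])\.\s*(.+?)\s*$", line)
        if match:
            options[match.group(1)] = match.group(2)
    return options

def answer_letter(answer: str) -> str:
    """Return the leading option letter of an answer such as 'A. right'."""
    return answer.strip()[0]

def image_source(path: str) -> str:
    """Guess the image source from the directory names listed in the fields table."""
    if "scannetv2_val" in path:
        return "ScanNet"
    if "val2017" in path:
        return "MS-COCO"
    return "unknown"

options = parse_choices(record["choices"])
letter = answer_letter(record["answer"])
source = image_source(record["image_path"][0])
print(letter, options[letter], source)
```

Keeping the letter and the option text separate makes it easier to compare a model's reply against the gold answer regardless of whether the model returns the letter or the full option.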
Uses
Load the dataset with the Hugging Face datasets library:
from datasets import load_dataset
ds = load_dataset("lidingm/ViewSpatial-Bench")
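Once the records are loaded, a model's predictions can be scored per task type. The card does not spell out the scoring protocol, so the following is only a minimal sketch that assumes plain multiple-choice accuracy: a prediction counts as correct when its option letter matches the first letter of `answer`. The function name `accuracy_by_type` is hypothetical, the gold answers are copied from the preview rows, and the predictions are made-up toy values.

```python
from collections import defaultdict

def accuracy_by_type(records, predicted_letters):
    """Percentage of exact option-letter matches, grouped by question_type."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, predicted in zip(records, predicted_letters):
        qtype = record["question_type"]
        gold = record["answer"].strip()[0]          # "D. back" -> "D"
        guess = predicted.strip().upper()[:1]       # tolerate "b", " B", "B. down"
        total[qtype] += 1
        correct[qtype] += int(guess == gold)
    return {qtype: 100.0 * correct[qtype] / total[qtype] for qtype in total}

# Gold answers taken from four preview rows; only the fields needed for scoring.
records = [
    {"question_type": "Camera perspective - Relative Direction", "answer": "A. right"},
    {"question_type": "Camera perspective - Relative Direction", "answer": "D. back"},
    {"question_type": "Camera perspective - Relative Direction", "answer": "D. back-right"},
    {"question_type": "Camera perspective - Relative Direction", "answer": "B. down"},
]
predictions = ["A", "D", "B", "b"]  # toy predictions, not real model output

scores = accuracy_by_type(records, predictions)
print(scores)
```

Rows obtained from `load_dataset` could be passed in place of the toy `records` list, since the function only reads the `question_type` and `answer` fields.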
Benchmark
We report results for a range of models on ViewSpatial-Bench. More model evaluations will be added.
| Model | Camera: Rel. Dir. | Camera: Obj. Ori. | Camera: Avg. | Person: Obj. Ori. | Person: Rel. Dir. | Person: Sce. Sim. | Person: Avg. | Overall |
|---|---|---|---|---|---|---|---|---|
| Proprietary Models | ||||||||
| GPT-4o | 41.46 | 19.58 | 33.57 | 42.97 | 40.86 | 26.79 | 36.29 | 34.98 |
| Gemini-2.0-Flash | 45.29 | 12.95 | 33.66 | 41.16 | 32.78 | 21.90 | 31.53 | 32.56 |
| GPT-5-mini | 56.97 | 27.41 | 46.34 | 43.98 | 49.29 | 26.06 | 38.77 | 42.44 |
| Gemini-2.5-Flash | 52.62 | 23.09 | 42.00 | 42.97 | 42.16 | 20.27 | 34.22 | 37.99 |
| Gemini-2.5-Pro | 58.71 | 32.73 | 49.37 | 48.59 | 45.84 | 25.79 | 39.24 | 44.15 |
| Gemini-3.0-Flash | 62.94 | 35.54 | 53.08 | 44.88 | 60.69 | 26.24 | 42.40 | 47.58 |
| GLM-4.6v | 56.35 | 36.35 | 49.16 | 48.90 | 47.39 | 23.44 | 38.91 | 43.87 |
| Doubao-Seed-1.8 | 62.10 | 45.28 | 56.05 | 44.98 | 62.47 | 33.67 | 45.74 | 50.74 |
| Doubao-Seed-2.0 | 65.60 | 44.78 | 58.11 | 47.19 | 72.09 | 33.57 | 49.20 | 53.52 |
| Open-Source General Models | ||||||||
| InternVL2.5 (2B) | 38.52 | 22.59 | 32.79 | 47.09 | 40.02 | 25.70 | 37.04 | 34.98 |
| Qwen3-VL (4B) | 46.98 | 28.01 | 40.16 | 45.68 | 29.22 | 17.74 | 30.48 | 35.17 |
| Qwen2.5-VL (7B) | 46.64 | 29.72 | 40.56 | 37.05 | 35.04 | 28.78 | 33.37 | 36.85 |
| LLaVA-NeXT-Video (7B) | 26.34 | 19.28 | 23.80 | 44.68 | 38.60 | 29.05 | 37.07 | 30.64 |
| LLaVA-OneVision (7B) | 29.84 | 26.10 | 28.49 | 22.39 | 31.00 | 26.88 | 26.54 | 27.49 |
| InternVL2.5 (8B) | 49.41 | 41.27 | 46.48 | 46.79 | 42.04 | 32.85 | 40.20 | 43.24 |
| Qwen3-VL (8B) | 54.60 | 30.32 | 45.87 | 45.28 | 35.75 | 26.79 | 35.61 | 40.58 |
| Llama-3.2-Vision (11B) | 25.27 | 20.98 | 23.73 | 51.20 | 32.19 | 18.82 | 33.61 | 28.82 |
| InternVL3 (14B) | 54.65 | 33.63 | 47.09 | 33.43 | 37.05 | 31.86 | 33.88 | 40.28 |
| Kimi-VL-Instruct (16B) | 26.85 | 22.09 | 25.14 | 63.05 | 43.94 | 20.27 | 41.52 | 33.58 |
| Qwen2.5-VL (32B) | 39.03 | 29.92 | 35.75 | 36.45 | 34.68 | 21.09 | 30.18 | 32.88 |
| Qwen2.5-VL (72B) | 50.65 | 26.71 | 42.04 | 42.17 | 42.76 | 24.80 | 35.82 | 38.83 |
| Qwen3-VL-Thinking (235B) | 59.73 | 36.95 | 51.54 | 43.67 | 48.93 | 31.67 | 40.67 | 45.94 |
| Qwen3.5-Plus (397B) | 62.21 | 38.65 | 53.74 | 50.20 | 68.17 | 38.37 | 50.90 | 52.28 |
| Multi-View Spatial Fine-Tuning | ||||||||
| Qwen2.5-VL (3B) | 43.43 | 33.33 | 39.80 | 39.16 | 28.62 | 28.51 | 32.14 | 35.85 |
| +SFT | 83.59 | 87.65 | 85.05 | 90.16 | 71.14 | 75.75 | 79.31 | 82.09 |
| Improvement over backbone | +40.16 | +54.32 | +45.25 | +51.00 | +42.52 | +47.24 | +47.17 | +46.24 |
| Random Baseline | 25.16 | 26.10 | 25.50 | 24.60 | 31.12 | 26.33 | 27.12 | 26.33 |
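The "Improvement over backbone" row can be reproduced by subtracting the Qwen2.5-VL (3B) scores from the +SFT scores column by column. The short check below uses only numbers copied from the table; the variable names are illustrative.

```python
# Column order: camera Rel. Dir., Obj. Ori., Avg.; person Obj. Ori., Rel. Dir., Sce. Sim., Avg.; Overall.
backbone = [43.43, 33.33, 39.80, 39.16, 28.62, 28.51, 32.14, 35.85]  # Qwen2.5-VL (3B)
sft = [83.59, 87.65, 85.05, 90.16, 71.14, 75.75, 79.31, 82.09]       # +SFT
reported = [40.16, 54.32, 45.25, 51.00, 42.52, 47.24, 47.17, 46.24]  # Improvement over backbone

# Round to two decimals to absorb floating-point noise in the subtraction.
improvement = [round(s - b, 2) for s, b in zip(sft, backbone)]

for got, expected in zip(improvement, reported):
    print(f"+{got:.2f} (table: +{expected:.2f})")
```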