I’ve encountered a problem that has been bothering me for a long time:
In the initialization function env_launcher.load_and_setup_env, I noticed that the setup step for each app performs some UI interactions. For example, for Chrome it executes controller.click_element("Accept & continue") and similar operations.
However, during actual evaluation, I found that these operations seem to be executed again from scratch. After inspecting the code, I believe the issue lies in the initialize_task function in android_world\task_evals\single\browser.py, which runs the following:
user_data_generation.clear_device_storage(env)
chrome_activity = adb_utils.extract_package_name(
    adb_utils.get_adb_activity('chrome')
)
adb_utils.clear_app_data(
    chrome_activity,
    env.controller,
)
This means the app’s state is completely reset.
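To make the sequence I am describing concrete, here is a minimal toy model of it. This is not android_world code: FakeApp, its methods, and simulate are hypothetical stand-ins I made up to illustrate the ordering (setup click first, then app data cleared), assuming that clearing app data discards the first-run state.

```python
class FakeApp:
    """Hypothetical stand-in for an app such as Chrome; not part of android_world."""

    def __init__(self):
        # True once the first-run welcome dialog has been dismissed.
        self.first_run_done = False

    def click_accept_and_continue(self):
        # Models the setup-phase click on "Accept & continue".
        self.first_run_done = True

    def clear_app_data(self):
        # Models clearing app data: persisted state is lost (assumption).
        self.first_run_done = False


def simulate():
    """Return the first-run flag after setup and after task initialization."""
    app = FakeApp()
    states = []
    app.click_accept_and_continue()  # what the setup phase does
    states.append(app.first_run_done)
    app.clear_app_data()             # what initialize_task appears to do
    states.append(app.first_run_done)
    return states


if __name__ == "__main__":
    print(simulate())
```

If the real behavior matches this model, the welcome dialog would reappear at the start of every task that clears Chrome's data, which is what I seem to observe.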
So my questions are:
What is the point of performing setup operations like click_element("Accept & continue") during the setup phase if the app data is cleared right before the actual task evaluation starts?
Is it normal/expected behavior that these setup clicks are executed again during evaluation?
Thank you very much for your reply!