Serialization & Breakpoint Recovery

Definition

Breakpoint recovery means saving the program's state incrementally to files while the program runs. When the program is interrupted externally (e.g., Ctrl-C) or hits an internal exception (e.g., an LLM API network failure), it can resume from the last saved state instead of starting over. This saves developers both time and cost.

Serialization and Deserialization

To support breakpoint recovery, program state must be structured and written to storage (serialized) so that it can be read back (deserialized) during recovery. Serialization logic differs by the kind of data a module holds:

Static Data: Serialized once during initialization.
Dynamic Data: Serialized in real time during execution to ensure data integrity.

Serialization is triggered when an exception is encountered or when execution ends normally.
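
The trigger described above can be sketched as a decorator that writes state to disk whether the wrapped function returns or raises. This is a minimal illustration, not the project's actual code: serialize_on_exit, get_state, and the shape of the state dictionary are hypothetical names chosen for the example.

```python
import functools
import json
import tempfile
from pathlib import Path


def serialize_on_exit(get_state, path):
    """Hypothetical decorator: save state when the call returns or raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on normal return, on exceptions, and on Ctrl-C
                # (KeyboardInterrupt), then lets any exception propagate.
                Path(path).parent.mkdir(parents=True, exist_ok=True)
                Path(path).write_text(json.dumps(get_state()), encoding="utf-8")
        return wrapper
    return decorator


# Illustrative state and storage location (both assumptions).
state = {"round": 0}
storage = Path(tempfile.mkdtemp()) / "team.json"


@serialize_on_exit(lambda: state, storage)
def run(fail_at):
    for i in range(3):
        if i == fail_at:
            raise ConnectionError("simulated LLM API failure")
        state["round"] = i + 1


try:
    run(fail_at=2)
except ConnectionError:
    pass  # the state was already written by the decorator

saved = json.loads(storage.read_text(encoding="utf-8"))
```

Using try/finally keeps both triggers in one place, so the normal-exit path and the failure path cannot drift apart.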

Implementation Logic

Possible Interruptions:

Network issues that cause an LLM API call to fail even after retries.
Parsing errors during action execution.
Manual interruptions (e.g., Ctrl-C).

Serialized Storage Structure:

All serialized data is stored in a single, integrated JSON file to minimize the impact of future changes.

Example File Structure:

./workspace
  storage
    team
      team.json  # Contains information about the team, environment, roles, actions, etc.
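
As a rough sketch of how a single integrated file could be written and read back, the helpers below save a nested dictionary to team.json and load it again. The function names save_team and load_team and the dictionary keys are assumptions for illustration; the real schema of team.json is not shown here.

```python
import json
import tempfile
from pathlib import Path


def save_team(team: dict, storage_dir: Path) -> Path:
    """Write the whole team state to storage_dir/team.json."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    target = storage_dir / "team.json"
    tmp = target.with_name("team.json.tmp")
    tmp.write_text(json.dumps(team, indent=2), encoding="utf-8")
    # Rename over the old file so an interruption during the write
    # does not leave a truncated team.json behind.
    tmp.replace(target)
    return target


def load_team(storage_dir: Path) -> dict:
    """Read the team state back from storage_dir/team.json."""
    return json.loads((storage_dir / "team.json").read_text(encoding="utf-8"))


# Hypothetical content; a temporary directory stands in for ./workspace.
storage_dir = Path(tempfile.mkdtemp()) / "workspace" / "storage" / "team"
team = {
    "idea": "xxx",
    "env": {"roles": {"RoleA": {"actions": ["ActionPass"], "next_action": 0}}},
}
save_team(team, storage_dir)
restored = load_team(storage_dir)
```

Keeping everything in one document means recovery needs only one read, at the cost of rewriting the whole file on each save.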

Recovery Execution Order

Scenario 1:

Role A exits abnormally during action selection.
Role B resumes execution based on unprocessed messages and continues reacting.

Scenario 2:

Role A completes its action, but Role B encounters an error during its second action.
Recovery starts with Role B resuming from the failed action.

Re-Execution Logic:

Messages: Messages act as communication bridges between roles. Interrupted messages are reloaded and re-executed.
Actions: Execution resumes from the exact action that was interrupted, maintaining granularity and order.
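
The re-execution logic above can be illustrated with a small sketch in which a role tracks its unprocessed messages and the index of its next action. RoleState, run_role, and ActionPass are hypothetical names for this example; only the idea of resuming at the interrupted action comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class RoleState:
    name: str
    actions: list                               # ordered action names
    next_action: int = 0                        # index of the action to run next
    inbox: list = field(default_factory=list)   # messages not yet processed


def run_role(role, execute):
    """Run pending actions; progress is recorded only after an action succeeds."""
    outputs = []
    while role.inbox:
        message = role.inbox[0]
        while role.next_action < len(role.actions):
            outputs.append(execute(role.actions[role.next_action], message))
            role.next_action += 1   # reached only if execute() did not raise
        role.inbox.pop(0)           # the message is now fully processed
        role.next_action = 0
    return outputs


# Simulate Scenario 2: Role B fails once on its second action.
failures = {"ActionRaise": 1}


def flaky_execute(action, message):
    if failures.get(action, 0) > 0:
        failures[action] -= 1
        raise RuntimeError(f"{action} failed")
    return f"{action}({message})"


role_b = RoleState("RoleB", ["ActionPass", "ActionRaise"], inbox=["msg from RoleA"])
try:
    run_role(role_b, flaky_execute)
except RuntimeError:
    pass  # in a real run, the state would be serialized at this point

# On recovery, only the interrupted action is executed again.
resumed = run_role(role_b, flaky_execute)
```

Because the index advances only after success, the completed first action is not repeated, and the interrupted message stays in the inbox until every action has handled it.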

Command for Breakpoint Recovery

strataai "xxx" --recover_path "./workspace/storage/team"

Serialized data is saved to ./workspace/storage/team by default.

Example Log Output:

2023-12-19 10:26:12.516 | DEBUG | strataai.team:run:101 - max n_round=3 left.
2023-12-19 10:26:12.517 | DEBUG | strataai.roles.role:run:517 - RoleA(Role A): no news. waiting.
2023-12-19 10:26:12.518 | INFO  | strataai.roles.role:_act:373 - RoleB(Role B): ready to ActionRaise
2023-12-19 10:26:12.519 | ERROR | strataai.utils.utils:wrapper:79 - Exception occurs, start to serialize the project, exp:
...

