  1. Apache Submarine
  2. SUBMARINE-1378

The current state of the experiment should be further refined


Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • experiment
    • None

    Description

      In some failure cases (e.g. the image cannot be pulled), Submarine does not pick up the actual task status, and the experiment is always shown as running.

      For example, when an image cannot be pulled, the actual job status is as follows.

      status:
        conditions:
          - lastProbeTime: '2023-04-01T03:50:53Z'
            reason: PodInitializing
            type: Waiting
          - lastProbeTime: '2023-04-01T03:50:39Z'
            message: >-
              rpc error: code = Unknown desc = error pulling image configuration: Get
              "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/5c/5ccab874feb97b32099f72978f97c8e7d129fbe7577464ad49b43f58f693ca90/data?verify=1680324025-7lKdJkTa1waOdofNoPtnsjwv%2FIQ%3D":
              EOF
            reason: ErrImagePull
            type: Waiting
          - lastProbeTime: '2023-04-01T03:49:58Z'
            message: >-
              Back-off pulling image
              "apache/submarine:jupyter-notebook-0.8.0-SNAPSHOT"
            reason: ImagePullBackOff
            type: Waiting
          - lastProbeTime: '2023-04-01T03:49:57Z'
            message: >-
              rpc error: code = Unknown desc = Error response from daemon: Head
              "https://registry-1.docker.io/v2/apache/submarine/manifests/jupyter-notebook-0.8.0-SNAPSHOT":
              Get
              "https://auth.docker.io/token?scope=repository%3Aapache%2Fsubmarine%3Apull&service=registry.docker.io":
              EOF
            reason: ErrImagePull
            type: Waiting
          - lastProbeTime: '2023-04-01T03:49:54Z'
            reason: PodInitializing
            type: Waiting
        containerState:
          waiting:
            reason: PodInitializing
        readyReplicas: 0
      

      Therefore, the experiment status should be refined so that failures like this are reported instead of a generic running state.
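      One possible way to derive a finer-grained state from a status shaped like the one above is sketched below. This is only an illustration, not Submarine's implementation: the function name `refine_status` and the returned state names ("Running", "ImagePullFailed", "Waiting") are hypothetical, and the sketch assumes the status has already been loaded into a plain dictionary with the same keys as the dump (`conditions`, `containerState`, `readyReplicas`).

```python
from typing import Any, Dict, Optional

# Waiting reasons that mean the image could not be pulled.
# Both values appear in the status dump in this issue.
IMAGE_PULL_FAILURE_REASONS = {"ErrImagePull", "ImagePullBackOff"}


def refine_status(status: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Map a raw job status to a finer-grained state.

    The state names returned here are hypothetical, not Submarine's
    actual status values.
    """
    # A ready replica means the workload really is running.
    if (status.get("readyReplicas") or 0) > 0:
        return {"state": "Running", "reason": None, "message": None}

    conditions = status.get("conditions") or []
    # lastProbeTime uses a fixed-width ISO-8601 UTC format, so sorting
    # the strings also sorts the conditions chronologically.
    failures = sorted(
        (c for c in conditions if c.get("reason") in IMAGE_PULL_FAILURE_REASONS),
        key=lambda c: c.get("lastProbeTime", ""),
    )
    if failures:
        latest = failures[-1]
        return {
            "state": "ImagePullFailed",
            "reason": latest.get("reason"),
            "message": latest.get("message"),
        }

    # No known failure: fall back to the container's waiting reason.
    waiting = (status.get("containerState") or {}).get("waiting") or {}
    return {"state": "Waiting", "reason": waiting.get("reason"), "message": None}


# Abbreviated copy of the status from this issue (messages shortened).
example = {
    "conditions": [
        {"lastProbeTime": "2023-04-01T03:50:53Z", "reason": "PodInitializing", "type": "Waiting"},
        {"lastProbeTime": "2023-04-01T03:50:39Z", "reason": "ErrImagePull", "type": "Waiting",
         "message": "rpc error: code = Unknown desc = error pulling image configuration"},
        {"lastProbeTime": "2023-04-01T03:49:58Z", "reason": "ImagePullBackOff", "type": "Waiting",
         "message": "Back-off pulling image"},
        {"lastProbeTime": "2023-04-01T03:49:57Z", "reason": "ErrImagePull", "type": "Waiting",
         "message": "rpc error: code = Unknown desc = Error response from daemon"},
        {"lastProbeTime": "2023-04-01T03:49:54Z", "reason": "PodInitializing", "type": "Waiting"},
    ],
    "containerState": {"waiting": {"reason": "PodInitializing"}},
    "readyReplicas": 0,
}

print(refine_status(example)["state"])  # prints "ImagePullFailed"
```

      The sketch scans the whole condition history rather than only the newest entry, because in the dump above the newest condition is again `PodInitializing` even though the pull keeps failing. Whether a single failed pull should already change the reported state, or only repeated failures, is a design decision this sketch leaves open.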

          People

            Assignee: Unassigned
            Reporter: chenxiang (cdmikechen)