Apache Submarine / SUBMARINE-1154

tf-job-operator cannot create experiment pods on OpenShift


Details

    Description

      I use OKD 4.7 to run Submarine. After I created an experiment, no pod started successfully.

      I checked the tf-job-operator log and found that the operator hits an error when it tries to create the pod.

       

      {"filename":"app/server.go:76","level":"info","msg":"EnvKubeflowNamespace not set, use default namespace","time":"2021-12-21T08:17:42Z"}
      {"filename":"app/server.go:80","level":"info","msg":"Using cluster scoped operator","time":"2021-12-21T08:17:42Z"}
      {"filename":"app/server.go:86","level":"info","msg":"[API Version: v1 Version: v0.1.0-alpha Git SHA: Not provided. Go Version: go1.13.5 Go OS/Arch: linux/amd64]","time":"2021-12-21T08:17:42Z"}
      W1221 08:17:42.318664       1 client_config.go:548] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
      {"filename":"tf-operator.v1/main.go:42","level":"info","msg":"Setting up client for monitoring on port: 8443","time":"2021-12-21T08:17:42Z"}
      {"filename":"tensorflow/controller.go:120","level":"info","msg":"Creating TFJob controller","time":"2021-12-21T08:17:42Z"}
      {"filename":"tensorflow/controller.go:127","level":"info","msg":"Creating Job controller","time":"2021-12-21T08:17:42Z"}
      I1221 08:17:42.377440       1 leaderelection.go:187] attempting to acquire leader lease  default/tf-operator...
      {"filename":"tensorflow/job.go:94","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"TFJob experiment-1640073634042-0001 is created.","time":"2021-12-21T08:17:42Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      I1221 08:18:08.422211       1 leaderelection.go:196] successfully acquired lease default/tf-operator
      {"filename":"tensorflow/controller.go:187","level":"info","msg":"Starting TFJob controller","time":"2021-12-21T08:18:08Z"}
      {"filename":"tensorflow/controller.go:190","level":"info","msg":"Waiting for informer caches to sync","time":"2021-12-21T08:18:08Z"}
      {"filename":"tensorflow/controller.go:196","level":"info","msg":"Starting 1 workers","time":"2021-12-21T08:18:08Z"}
      {"filename":"tensorflow/controller.go:202","level":"info","msg":"Started workers","time":"2021-12-21T08:18:08Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:08Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:08Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (859.732664ms)","time":"2021-12-21T08:18:09Z"}
      E1221 08:18:09.383188       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (271.610488ms)","time":"2021-12-21T08:18:09Z"}
      E1221 08:18:09.664630       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (16.622312ms)","time":"2021-12-21T08:18:09Z"}
      E1221 08:18:09.695125       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (20.935691ms)","time":"2021-12-21T08:18:09Z"}
      E1221 08:18:09.742111       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (126.441784ms)","time":"2021-12-21T08:18:09Z"}
      E1221 08:18:09.909122       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:09Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:09Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:10Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (37.721013ms)","time":"2021-12-21T08:18:10Z"}
      E1221 08:18:10.027525       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:10Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:10Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:10Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:10Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (36.386638ms)","time":"2021-12-21T08:18:10Z"}
      E1221 08:18:10.224908       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:10Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:10Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:10Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:10Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (172.821236ms)","time":"2021-12-21T08:18:10Z"}
      E1221 08:18:10.753307       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:10Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:11Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:11Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:11Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (73.171981ms)","time":"2021-12-21T08:18:11Z"}
      E1221 08:18:11.469567       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:11Z"}
      {"filename":"tensorflow/job.go:130","level":"info","msg":"Updating tfjob: experiment-1640073634042-0001","time":"2021-12-21T08:18:12Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:12Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:12Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:12Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (36.756625ms)","time":"2021-12-21T08:18:12Z"}
      E1221 08:18:12.423556       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:12Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:12Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:12Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:12Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (24.93148ms)","time":"2021-12-21T08:18:12Z"}
      E1221 08:18:12.785557       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:12Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:17Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:17Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:17Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (22.903888ms)","time":"2021-12-21T08:18:17Z"}
      E1221 08:18:17.945442       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:17Z"}
      {"filename":"tensorflow/controller.go:339","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Reconcile TFJobs experiment-1640073634042-0001","time":"2021-12-21T08:18:28Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/pod.go:80","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Need to create new pod: worker-0","replica-type":"worker","time":"2021-12-21T08:18:28Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:470","job":"submarine.experiment-1640073634042-0001","level":"warning","msg":"reconcilePods error pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:28Z","uid":"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4"}
      {"filename":"tensorflow/controller.go:290","job":"submarine.experiment-1640073634042-0001","level":"info","msg":"Finished syncing tfjob \"submarine/experiment-1640073634042-0001\" (183.401302ms)","time":"2021-12-21T08:18:28Z"}
      E1221 08:18:28.381347       1 controller.go:266] error syncing tfjob: pods "experiment-1640073634042-0001-worker-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
      {"filename":"record/event.go:221","level":"info","msg":"Event(v1.ObjectReference{Kind:\"TFJob\", Namespace:\"submarine\", Name:\"experiment-1640073634042-0001\", UID:\"cf89ad4f-bf54-42fa-b8ce-e30fc1c9b3d4\", APIVersion:\"kubeflow.org/v1\", ResourceVersion:\"287590271\", FieldPath:\"\"}): type: 'Warning' reason: 'FailedCreatePod' Error creating: pods \"experiment-1640073634042-0001-worker-0\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , \u003cnil\u003e","time":"2021-12-21T08:18:28Z"}
      
      

      I checked the training-operator repository and searched for "finalizers": https://github.com/kubeflow/training-operator/search?q=finalizers

      The screenshot indicates that we need to add the relevant resources when the RBAC rules are created.
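For illustration, here is a minimal sketch of the kind of RBAC rule that would address the "cannot set blockOwnerDeletion" error. The error usually comes from the OwnerReferencesPermissionEnforcement admission plugin, which requires the creator of the pod to have `update` permission on the `finalizers` subresource of the owner object. The API group `kubeflow.org` and the kind `TFJob` are taken from the log above. The resource plural `tfjobs` and the role name `tf-job-operator-finalizers` are assumptions, not taken from the Submarine manifests.

```python
import json


def finalizers_rule(group="kubeflow.org", plural="tfjobs"):
    """Build an RBAC PolicyRule that allows updating the owner's finalizers subresource.

    `group` matches the APIVersion seen in the log (kubeflow.org/v1).
    `plural` is an assumed CRD plural name; adjust it to the installed CRD.
    """
    return {
        "apiGroups": [group],
        "resources": [f"{plural}/finalizers"],
        "verbs": ["update"],
    }


def cluster_role(name="tf-job-operator-finalizers"):
    """Wrap the rule in a ClusterRole manifest. The role name is hypothetical."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [finalizers_rule()],
    }


if __name__ == "__main__":
    # Print the manifest as JSON, which can be saved to a file and applied.
    print(json.dumps(cluster_role(), indent=2))
```

The printed JSON could be saved to a file and applied with `oc apply -f` or `kubectl apply -f`. The role would still need to be bound to the service account that tf-job-operator runs as. In practice it is probably simpler to add the same rule to the operator's existing role.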

            People

              chenxiang cdmikechen