ZenML - feature request priorities - /s (strangemonad's notes)

updated: 2022-08-03

## blocking bugs

Currently none that we haven't worked around.

## preventing production use

- [ ] (re-)deploy scheduled pipelines from a CI job
  - [ ] related: better tracking of versions of the same pipeline over time. One technique could be `<PIPELINE_NAME>-<DATETIME>-<SHA>`, e.g. `modeltraining-20220801123060200-abcf423d`
- [ ] Dynamic Kubeflow pipeline steps. E.g. I have a job that needs to run a query and then, based on the results, run a step for each group; or I have a large batch that I want to parallelize with up to n concurrent jobs. KFP and Argo pipelines support this natively (I believe since the first version of pipelines in Kubeflow). I'm not sure how to do this with TFX components though.
- [ ] [non-local Kubeflow Metadata store incorrectly reports not running](https://github.com/zenml-io/zenml/issues/756) (we've worked around it by writing a pid file)
- [ ] [[FEATURE]: allow `slack_alerter` stack component to use secrets](https://github.com/zenml-io/zenml/issues/752) (we've worked around this, but it's a hard-coded secret name in a custom Slack alerter, so it sucks)
- [x] scoped secrets (0.12.0)

## preventing broader adoption

- [ ] [[FEATURE]: return a reference to the pipeline run from the BasePipeline#run method](https://github.com/zenml-io/zenml/issues/726)
- [ ] Publish but don't run a pipeline (so it can be picked and run from the KFP UI)
  - [ ] related: expose pipeline parameters as KFP pipeline arguments that an end user can pick
- [ ] (no issue yet) easier out-of-the-box experience for data scientists and devs
  - How do I provide an experience where I just clone a repo and run a single command across multiple projects (without copying a shell script to every project)?
- [ ] (shawnmorel) I still need to flesh this one out, but we need a better way to streamline pulling in different configurations per environment, e.g. I already know that I'm running in dev or prod and I want to pick a different set of Slack channels for the alerter ()
- [ ] KF config is different in-cluster and out of cluster
- [ ] KF config requires a kubernetes_context

## nice to have

- [ ] (We still need to provide a better design for this) [[FEATURE]: Output protocols - Allow capturing specific types of zenml.steps.Output signatures](https://github.com/zenml-io/zenml/issues/686)
- [ ] [[FEATURE]: Allow fetching from Repository by name or type](https://github.com/zenml-io/zenml/issues/727) [enhancement](https://github.com/zenml-io/zenml/issues?q=is%3Aopen+is%3Aissue+author%3Astrangemonad+label%3Aenhancement)
- [ ] richer Slack alerter output, using the Slack blocks API. Currently we just have our own custom Slack alerter.
- [ ] KFP recurring run pipeline prefix doesn't have a `-`
- [ ] Because of how everything is looked up and loaded dynamically, the `zenml` CLI is starting to get really slow to use.

## unprioritized

- [ ] need to better solve how dependencies are bundled
- [ ] smaller KFP job payloads (related to how dependencies are bundled)
- [ ] `pipeline_spec.pipeline_name = None` (can't associate back cleanly)
- [ ] build before deploying a new schedule - e.g. we currently pause existing schedules before running the new one to prevent multiple concurrent pipelines.

---

- Links:
- Created at: [[2022-06-19]]
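
Sketch for the `<PIPELINE_NAME>-<DATETIME>-<SHA>` versioning idea under "preventing production use": a rough take on how a CI job could build such a name before (re-)deploying a schedule. This is not a ZenML or KFP API. The helper names `versioned_pipeline_name` and `split_versioned_name` are made up for illustration, and the timestamp layout (UTC, `YYYYMMDDHHMMSS` plus three millisecond digits) and the 8-character SHA prefix are assumptions inferred from the example name.

```python
from datetime import datetime, timezone
from typing import Tuple


def versioned_pipeline_name(name: str, built_at: datetime, sha: str) -> str:
    """Build `<PIPELINE_NAME>-<DATETIME>-<SHA>` (hypothetical helper)."""
    # Assumed layout: YYYYMMDDHHMMSS followed by 3 millisecond digits.
    millis = built_at.microsecond // 1000
    stamp = f"{built_at:%Y%m%d%H%M%S}{millis:03d}"
    # Assumed: a short 8-character prefix of the commit SHA is enough.
    return f"{name}-{stamp}-{sha[:8]}"


def split_versioned_name(versioned: str) -> Tuple[str, str, str]:
    """Recover (name, datetime stamp, sha) from a versioned name."""
    # Split from the right so pipeline names containing `-` stay intact.
    name, stamp, sha = versioned.rsplit("-", 2)
    return name, stamp, sha


if __name__ == "__main__":
    # In CI the SHA would come from the build environment; this value is a placeholder.
    print(versioned_pipeline_name("modeltraining", datetime.now(timezone.utc), "abcf423d9e0f"))
```

Because the timestamp is fixed-width and zero-padded, sorting these names lexicographically per pipeline also sorts them by build time, which might be enough to find the latest deployed version without extra metadata.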