I have created a Python script using the Azure SDK to train a machine learning model and save an output .csv file. I created a Python environment with all of the conda and pip dependencies needed to run the script and registered that environment with the workspace. However, when I run my script using the registered environment, my experiments keep failing. The log shows that the script fails because it cannot find the 'ipykernel' module when importing matplotlib. Here is the full text of the error message:
[2021-04-15T22:52:41.716160] The experiment failed. Finalizing run...
[2021-04-15T22:52:41.716178] Start FinalizingInRunHistory
[2021-04-15T22:52:41.717507] Logging experiment finalizing status in history service.
Starting the daemon thread to refresh tokens in background for process with pid = 22305
Cleaning up all outstanding Run operations, waiting 300.0 seconds
1 items cleaning up...
Cleanup took 0.07227706909179688 seconds
Traceback (most recent call last):
File "prod_model.py", line 5, in <module>
import matplotlib.pyplot as plt
File "/home/azureuser/.azureml/envs/azureml_1525a6aa7633563e0d590fe86701d51d/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2356, in <module>
switch_backend(rcParams["backend"])
File "/home/azureuser/.azureml/envs/azureml_1525a6aa7633563e0d590fe86701d51d/lib/python3.7/site-packages/matplotlib/pyplot.py", line 221, in switch_backend
backend_mod = importlib.import_module(backend_name)
File "/home/azureuser/.azureml/envs/azureml_1525a6aa7633563e0d590fe86701d51d/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'ipykernel'
[2021-04-15T22:52:42.232402] Finished context manager injector with Exception.
I have tried to add ipykernel as both a conda and a pip dependency when creating the environment, but neither method is able to find the ipykernel package when I run the environment creation code. I have even tried pinning the exact same version of every package installed on my local machine (where the code runs without errors).
If anyone has any thoughts as to how to resolve this issue, I'd love to hear them. Thanks in advance for any help you can provide.
@Gregory Jacobs
I faced a similar error on my local machine when I ran an experiment in my Jupyter notebook installation and the kernel in use could not find this module. I reinstalled the package with the following command, and the kernel worked fine afterwards. Could you try the same in your environment to install the package in the user context?
python -m ipykernel install --user
Thanks for the response. I ran this command within the Jupyter notebook environment that runs the experiment (the one that calls the environment and script to train my model), and it looks like the ipykernel module installed fine, but I am still getting the same error when I run the experiment. Do I need to install the ipykernel module within the environment that I registered with my workspace to run the experiment? I've tried to install ipykernel both as a conda dependency and as a pip dependency, but neither was able to find that package.
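One way to check whether the module is visible to the interpreter that actually runs the script is to probe for it at the top of the training script, before matplotlib is imported. The following is only a minimal sketch: the helper name `missing_modules` is hypothetical, and printing `MPLBACKEND` is based on the assumption that the backend shown in the traceback may be coming from that environment variable.

```python
import importlib.util
import os
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which Python is running the script; in a remote run this should be
    # the interpreter inside the registered environment, not the notebook kernel.
    print("Interpreter:", sys.executable)
    # Assumption: the matplotlib backend may be set through this variable.
    print("MPLBACKEND:", os.environ.get("MPLBACKEND"))
    print("Missing:", missing_modules(["ipykernel", "matplotlib"]))
```

If `ipykernel` shows up as missing in the run log, the package was installed into the notebook's Python rather than into the environment the experiment uses.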
UPDATE - I was able to resolve this issue by adding the ipykernel package as a dependency when building the Python environment for my VM to run. The key was to add the ipykernel package to the environment before matplotlib - if I added matplotlib first, it resulted in an error.
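For anyone hitting the same error, the fix described above might look roughly like the sketch below. It assumes the azureml-core SDK (v1) is installed; the environment name `training-env`, the helper `build_environment`, and the package list are placeholders rather than the exact code from this thread. The list simply mirrors the order described in the update, with ipykernel ahead of matplotlib; whether the order matters in every setup is not confirmed here.

```python
# Placeholder package list: ipykernel is listed before matplotlib,
# mirroring the order described in the update above.
CONDA_PACKAGES = ["ipykernel", "matplotlib"]


def build_environment(name="training-env"):
    """Build an Azure ML environment that includes ipykernel (sketch only)."""
    # Imported lazily so this file can be loaded without azureml-core installed.
    from azureml.core import Environment
    from azureml.core.conda_dependencies import CondaDependencies

    deps = CondaDependencies.create(conda_packages=CONDA_PACKAGES)
    env = Environment(name=name)
    env.python.conda_dependencies = deps
    return env


# Example usage (requires a workspace config file and azureml-core):
#   from azureml.core import Workspace
#   ws = Workspace.from_config()
#   build_environment().register(workspace=ws)
```

The remaining packages the script needs would be added to the same list (or passed as `pip_packages`) before registering the environment.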
Thanks for your help!