Machine learning with Python
The cluster offers ready-to-use Python environments including a set of libraries and tools adapted to data analysis and machine learning:
- Data loading: pandas, numpy
- Visualization: matplotlib, Seaborn, Altair, Plotly, Bokeh
- TensorFlow and TensorBoard
- PyTorch
- Scikit-learn
Access#
From Jupyter#
You can access these environments from JupyterHub by choosing the Python 3.7 or 3.9 kernel.
From the Unix / SLURM shell#
You can access these environments by loading the corresponding module: module load python/3.7 or module load python/3.9.
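Once a kernel is selected or a module is loaded, a quick way to confirm what the environment provides is to check which libraries can be imported. The following is a minimal sketch and not part of the cluster tooling: the function name `check_libraries` and the import names in `LIBRARIES` are assumptions derived from the library list above.

```python
import importlib.util
import sys

# Import names assumed from the library list above; adjust as needed.
LIBRARIES = ["pandas", "numpy", "matplotlib", "seaborn", "altair",
             "plotly", "bokeh", "tensorflow", "torch", "sklearn"]


def check_libraries(names=LIBRARIES):
    """Return a dict mapping each import name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, found in check_libraries().items():
        print(f"{name:12s} {'available' if found else 'missing'}")
```

Using `find_spec` only locates each package without importing it, so the check stays fast even for heavy libraries such as TensorFlow or PyTorch.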
Tensorflow and Tensorboard in notebooks#
By default, TensorFlow logs all messages. To hide INFO, WARNING, or ERROR logs, set the environment variable TF_CPP_MIN_LOG_LEVEL before importing TensorFlow:
# Reduce TensorFlow log verbosity (set this before importing tensorflow)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# 0 = all messages are logged (default behavior)
# 1 = INFO messages are not printed
# 2 = INFO and WARNING messages are not printed
# 3 = INFO, WARNING, and ERROR messages are not printed
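TensorFlow reads this variable when the library is first loaded, so the value has to be in place before `import tensorflow` runs. The sketch below wraps the assignment in a small helper that rejects values outside the four documented levels. `set_tf_log_level` is a hypothetical name used for illustration, not a TensorFlow function.

```python
import os


def set_tf_log_level(level):
    """Set TF_CPP_MIN_LOG_LEVEL after checking that level is 0, 1, 2 or 3.

    Hypothetical helper: call it before importing tensorflow.
    """
    if str(level) not in {"0", "1", "2", "3"}:
        raise ValueError(f"level must be 0, 1, 2 or 3, got {level!r}")
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)


set_tf_log_level(2)  # hide INFO and WARNING messages
# import tensorflow as tf  # import only after the variable is set
```

In a notebook, this belongs in the first cell. If TensorFlow has already been imported in the running kernel, restarting the kernel is the simplest way to make a new level take effect.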
An IPython magic command integrates TensorBoard directly into notebooks. For this integration to work on JupyterHub, you need to set the environment variable TENSORBOARD_PROXY_URL so that Jupyter accesses TensorBoard through the JupyterHub proxy. To do this, add this cell to your notebook before calling the TensorBoard magic command:
# Set proxy for TensorBoard access through JupyterHub
import os
os.environ['TENSORBOARD_PROXY_URL'] = f"/user/{os.environ.get('USER')}/proxy/%PORT%/"
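The cell above relies on the `USER` environment variable. If it is unset, `os.environ.get('USER')` returns `None` and the URL contains the literal text `None`. A slightly more defensive sketch follows, assuming your JupyterHub user name matches your Unix login name. `tensorboard_proxy_url` is a hypothetical helper, not part of TensorBoard or JupyterHub.

```python
import getpass
import os


def tensorboard_proxy_url(user=None):
    """Build the proxy URL pattern; %PORT% is left as a placeholder."""
    if user is None:
        # Assumption: fall back to the login name when USER is not set.
        user = os.environ.get("USER") or getpass.getuser()
    return f"/user/{user}/proxy/%PORT%/"


# Example with an explicit, made-up user name:
print(tensorboard_proxy_url("alice"))  # prints /user/alice/proxy/%PORT%/

# In a notebook, set the variable for the current user:
# os.environ["TENSORBOARD_PROXY_URL"] = tensorboard_proxy_url()
```

Once the variable is set, TensorBoard is typically started in a notebook with `%load_ext tensorboard` followed by `%tensorboard --logdir logs`, where `logs` stands for whatever directory your training run writes to.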
Sample notebooks#
The following sample notebooks have been tested on the cluster with Python 3.7 and 3.9