Wiki source code of Using JupyterHub in HDC
Version 1.5 by Susan Evans on 2023/07/11 13:55
author | version | line-number | content |
---|---|---|---|
1.1 | 1 | JupyterHub is an open-source, multi-user version of Jupyter Notebook for performing analysis of Project files in the Core. More information can be found in the application documentation [[https:~~/~~/jupyter.org/>>https://jupyter.org/]]. |
2 | |||
3 | = How it Works = | ||
4 | |||
5 | JupyterHub allows Project members to create or import Jupyter Notebooks into the Project Workspace environment, retrieve Project files from the Core, perform computational workflows on the data, and write the outputs back to the Core where they can be accessed by other Project members. JupyterHub spins up a new JupyterLab instance for each Project member. | ||
6 | |||
7 | = Prerequisites = | ||
8 | |||
9 | * Project Collaborator role or higher. | ||
10 | * JupyterHub has been configured for the Project by the Platform Administrator. See //Getting Access to JupyterHub//. | ||
11 | |||
12 | = Data Stewardship = | ||
13 | |||
14 | Users are reminded to abide by the Platform Terms of Use and any Project-specific restrictions when using Workspace tools to access data and code. | ||
15 | |||
16 | = Getting Access to JupyterHub = | ||
17 | |||
18 | JupyterHub is configured at the time of Project Setup. If you launch JupyterHub and receive a notice that it hasn’t been deployed for your project, please contact your Platform Administrator. | ||
19 | |||
20 | = Launching JupyterHub = | ||
21 | |||
22 | [[image:HDC Project Workspace tool navigation Jupyterhub v1.0.0 2023-05-25.png||height="10%" width="30%"]] | ||
23 | |||
24 | 1. Launch your Project and click the **JupyterHub icon** in the left menu bar. | ||
25 | 1. Click **Sign in with Keycloak** to initiate your session. JupyterHub automatically authenticates with your existing username and password and launches your session - no additional sign-in is required. | ||
26 | 1. You can choose to start either a **Minimal environment**, which comes with Python, or a **Datascience environment**, which adds R and Julia to Python. | ||
27 | 1. From the JupyterHub home page (a JupyterLab interface) you can now create and work on Jupyter Notebooks, import existing ones, and use the Pilot Command Line Interface in the terminal to retrieve, analyze, and re-upload Project Core data. You can also use conda, the pre-deployed and pre-configured package manager, to download, install, and manage packages such as Python libraries as needed (see the sections //Installing New Python Packages// and //Creating a Python Virtual Environment and Registering a Kernel// below for more details). | ||
28 | 1. When finished using JupyterHub, click **Logout** to end your session. | ||
29 | |||
30 | = Creating a Notebook = | ||
31 | |||
32 | Users can create a new Jupyter Notebook with Python 3 inside JupyterHub, with dedicated and persistent storage under the user's Home Directory. | ||
33 | |||
34 | 1. In the Launcher, click the **Python 3 Notebook** icon, or click **File > New > Notebook**. | ||
35 | 1. Create your Notebook. | ||
36 | |||
37 | [[image:Project Workspace Jupyter Create Python Notebook v2.1.6 2023-02-07.png||height="22%" width="50%"]] | ||
38 | |||
39 | = Launching the Terminal = | ||
40 | |||
41 | JupyterHub provides browser-based terminal access for advanced users to run commands directly in the system shell. Importantly, this allows users to sync data between, for instance, the Project's Core and their JupyterHub home directory using pilotcli, or to download and manage Python packages. | ||
42 | |||
43 | 1. In the Launcher, click the **Terminal** icon, or click **File > New > Terminal**. | ||
44 | 1. The terminal window opens. | ||
45 | |||
46 | [[image:Project Workspace Jupyter Launch Terminal v2.1.6 2023-02-07.png||height="9%" width="50%"]] | ||
47 | |||
48 | Ubuntu is used to host Jupyter Notebook. Use the command {{code}}cat /etc/os-release{{/code}} to determine the current version of Ubuntu: | ||
49 | |||
50 | {{code language="none"}} | ||
51 | uname@jupyter-uname:/etc$ cat os-release | ||
52 | NAME="Ubuntu" | ||
53 | VERSION="20.04.4 LTS (Focal Fossa)" | ||
54 | ID=ubuntu | ||
55 | ID_LIKE=debian | ||
56 | PRETTY_NAME="Ubuntu 20.04.4 LTS" | ||
57 | VERSION_ID="20.04" | ||
58 | HOME_URL="https://www.ubuntu.com/" | ||
59 | SUPPORT_URL="https://help.ubuntu.com/" | ||
60 | BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" | ||
61 | PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" | ||
62 | VERSION_CODENAME=focal | ||
63 | UBUNTU_CODENAME=focal | ||
64 | {{/code}} | ||
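
The same information can also be read from inside a notebook. The following is a minimal sketch using only the Python standard library; the helper name {{code}}parse_os_release{{/code}} is an illustrative assumption and not part of the platform:

```python
from pathlib import Path


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


if __name__ == "__main__":
    release_file = Path("/etc/os-release")
    # The file exists on Ubuntu; the check keeps the sketch safe elsewhere.
    if release_file.exists():
        release = parse_os_release(release_file.read_text(encoding="utf-8"))
        print(release.get("PRETTY_NAME", "unknown"))
```

On the system shown above, the {{code}}PRETTY_NAME{{/code}} entry corresponds to the line {{code}}PRETTY_NAME="Ubuntu 20.04.4 LTS"{{/code}}.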
65 | |||
66 | = Creating a Python Virtual Environment and Registering a Kernel = | ||
67 | |||
68 | The user has full flexibility to use different virtual environment and/or package management systems. The examples below use conda or Python's built-in venv module. Importantly, in either case, the user has to register the new environment as a kernel using ipykernel to make it accessible via the Jupyter Notebooks (see //Registering the new Virtual Environment as Kernel// for more details). | ||
69 | |||
70 | == Using conda == | ||
71 | |||
72 | The package management software conda by Anaconda has become one of the most popular package management systems, especially for Data and Life Sciences. Therefore, conda is already pre-deployed and configured in each user’s JupyterHub. Please find the full documentation of conda [[here>>url:https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html]], and the corresponding documentation of how to manage virtual environments using conda [[here>>url:https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html]]. The following steps provide a short example of how you can use conda to create a new virtual environment using the JupyterHub terminal within the Platform. | ||
73 | |||
74 | First, activate conda. Since it is already pre-deployed and configured for you, all you need to do is launch a terminal within JupyterHub (see //Launching the Terminal// above) and execute the command {{code}}source activate{{/code}}. Once conda is active, the name of the currently activated conda environment appears in parentheses at the beginning of the prompt, usually “base”: | ||
75 | |||
76 | {{code language="none"}} | ||
77 | username@jupyter-username:~$ source activate | ||
78 | (base) username@jupyter-username:~$ | ||
79 | {{/code}} | ||
80 | |||
81 | To create a new environment, run the following command in the terminal after activating conda: | ||
82 | |||
83 | {{code language="none"}} | ||
84 | (base) username@jupyter-username:~$ conda create --name your_env_name | ||
85 | {{/code}} | ||
86 | |||
87 | Replace {{code}}your_env_name{{/code}} with your preferred name for the environment. When conda prompts you to confirm the creation of the environment at the specified location, type “y” to proceed or “n” to abort. The default location is in the user's home directory; please do not change it, so that the environment persists between sessions. Once confirmed, conda completes the environment creation process and reminds you to activate the environment: | ||
88 | |||
89 | {{code language="none"}} | ||
90 | (base) username@jupyter-username:~$ conda create --name sample_env | ||
91 | Collecting package metadata (current_repodata.json): done | ||
92 | Solving environment: done | ||
93 | |||
94 | ## Package Plan ## | ||
95 | |||
96 | environment location: /home/username/.conda_envs/sample_env | ||
97 | |||
98 | Proceed ([y]/n)? y | ||
99 | |||
100 | Preparing transaction: done | ||
101 | Verifying transaction: done | ||
102 | Executing transaction: done | ||
103 | # | ||
104 | # To activate this environment, use | ||
105 | # | ||
106 | # $ conda activate sample_env | ||
107 | # | ||
108 | # To deactivate an active environment, use | ||
109 | # | ||
110 | # $ conda deactivate | ||
111 | |||
112 | (base) username@jupyter-username:~$ | ||
113 | {{/code}} | ||
114 | |||
115 | Please note that at the end of the environment creation process you remain in the previously activated environment (“base”, in this example). Therefore, remember to activate the new environment before installing any packages by running the command {{code}}conda activate your_env_name{{/code}}, replacing “your_env_name” with the name you chose (“sample_env” in this example): | ||
116 | |||
117 | {{code language="none"}} | ||
118 | (base) username@jupyter-username:~$ conda activate sample_env | ||
119 | (/home/username/.conda_envs/sample_env) username@jupyter-username:~$ | ||
120 | {{/code}} | ||
121 | |||
122 | You can now install the desired packages in this new conda environment, for instance using the {{code}}conda install{{/code}} command. For example, in order to install the latest version of Python, run: | ||
123 | |||
124 | {{code language="none"}} | ||
125 | (/home/username/.conda_envs/sample_env) username@jupyter-username:~$ conda install python | ||
126 | {{/code}} | ||
127 | |||
128 | To see a list of all installed packages in the currently activated environment (indicated in parentheses at the beginning of the line, “base” in this case), run: | ||
129 | |||
130 | {{code language="none"}} | ||
131 | (base) username@jupyter-username:~$ conda list | ||
132 | {{/code}} | ||
133 | |||
134 | To see a list of all existing conda environments, run: | ||
135 | |||
136 | {{code language="none"}} | ||
137 | (base) username@jupyter-username:~$ conda info --envs | ||
138 | {{/code}} | ||
139 | |||
140 | Please find many more examples and the full documentation of how to manage conda environments [[here>>url:https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#]]. Importantly, please remember to follow the instructions in the //Registering the new Virtual Environment as Kernel// section below, to make the virtual environment accessible via the Jupyter Notebooks. | ||
141 | |||
142 | == Using venv == | ||
143 | |||
144 | As an alternative to conda, you can use venv, the virtual environment module built into Python. Please find the full documentation of venv [[here>>url:https://docs.python.org/3/library/venv.html#]], and a short example of how to create and activate a new virtual environment using venv below: | ||
145 | |||
146 | {{code language="none"}} | ||
147 | username@jupyter-username:~$ python3 -m venv your_env_name | ||
148 | username@jupyter-username:~$ source your_env_name/bin/activate | ||
149 | {{/code}} | ||
150 | |||
151 | == Registering the new Virtual Environment as Kernel == | ||
152 | |||
153 | To make the newly created virtual environment accessible to the Jupyter Notebooks, you have to register it using ipykernel. Importantly, make sure that the corresponding environment is currently active before running the following command: | ||
154 | |||
155 | {{code language="none"}} | ||
156 | username@jupyter-username:~$ python -m ipykernel install --user --name=your_env_name | ||
157 | {{/code}} | ||
158 | |||
159 | Please replace {{code}}your_env_name{{/code}} with the name of your newly created environment. Depending on which package and/or virtual environment management system you chose to use, you may have to install ipykernel in the newly created environment first. Remember to activate the newly created environment and then run one of the following commands to install ipykernel, depending on your package management system of choice: | ||
160 | |||
161 | {{code language="none"}} | ||
162 | (your_env_name) username@jupyter-username:~$ conda install -c anaconda ipykernel | ||
163 | {{/code}} | ||
164 | |||
165 | or: | ||
166 | |||
167 | {{code language="none"}} | ||
168 | username@jupyter-username:~$ pip install ipykernel | ||
169 | {{/code}} | ||
170 | |||
171 | Once you have installed ipykernel, re-run the command above to register your new environment via ipykernel. | ||
172 | |||
173 | **Example usage:** | ||
174 | |||
175 | {{code language="none"}} | ||
176 | (/home/username/.conda_envs/sample_env) username@jupyter-username:~$ python -m ipykernel install --user --name=sample_env | ||
177 | Installed kernelspec sample_env in /home/username/.local/share/jupyter/kernels/sample_env | ||
178 | (/home/username/.conda_envs/sample_env) username@jupyter-username:~$ | ||
179 | {{/code}} | ||
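
To double-check the registration from a notebook or script, you can read the kernelspec files directly. The sketch below assumes the user-level kernels directory shown in the example output above ({{code}}~/.local/share/jupyter/kernels{{/code}}); the function name {{code}}list_user_kernels{{/code}} is hypothetical:

```python
import json
from pathlib import Path


def list_user_kernels(kernels_dir):
    """Return {kernel_name: display_name} for each kernelspec in kernels_dir."""
    kernels = {}
    root = Path(kernels_dir)
    if not root.is_dir():
        return kernels
    for spec_file in sorted(root.glob("*/kernel.json")):
        with spec_file.open(encoding="utf-8") as handle:
            spec = json.load(handle)
        # The directory name is the kernel name that was passed via --name.
        name = spec_file.parent.name
        kernels[name] = spec.get("display_name", name)
    return kernels


if __name__ == "__main__":
    # Assumed location, taken from the "Installed kernelspec" message above.
    user_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    for name, display_name in list_user_kernels(user_dir).items():
        print(f"{name}: {display_name}")
```

If the environment was registered as in the example, the listing should include an entry named {{code}}sample_env{{/code}}.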
180 | |||
181 | Afterwards, the environment will be listed when you open the Launcher to open a new Jupyter Notebook: | ||
182 | |||
1.2 | 183 | [[image:Project Workspace Jupyter view new Kernel 2023-07-11.png||height="25%" width="50%"]] |
1.1 | 184 | |
185 | |||
1.2 | 186 | and also from each opened Notebook, e.g., via **Kernel > Change Kernel…** : |
1.1 | 187 | |
1.2 | 188 | [[image:Project Workspace Jupyter Kernel change Kernel dropdown 2023-07-11.png||height="64%" width="50%"]] |
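
After switching kernels, you may want to confirm which Python installation the notebook is actually using. The following cell is a minimal sketch based on the standard {{code}}sys{{/code}} module; the helper name {{code}}describe_interpreter{{/code}} is an illustrative assumption:

```python
import sys


def describe_interpreter():
    """Summarize which Python installation the current kernel is using."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # True for environments created with venv. Conda environments are
        # full installations, so for those compare "prefix" with the
        # environment location instead.
        "is_venv": sys.prefix != sys.base_prefix,
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

For the conda example above, the prefix would be expected to match the environment location reported during creation ({{code}}/home/username/.conda_envs/sample_env{{/code}}).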
1.1 | 189 | |
1.2 | 190 | = Installing New Python Packages = |
191 | |||
192 | We highly recommend the use of virtual environments when installing new packages (see //Creating a Python Virtual Environment and Registering a Kernel// above for more details). Consequently, we recommend installing new packages via commands in the JupyterHub terminal in the corresponding virtual environments, instead of installing packages from within Jupyter Notebooks. | ||
193 | |||
194 | Depending on your organization's IT policies, outbound traffic may need to go through a proxy. If so, you will need to pass the proxy as a command-line argument to tools such as pip, curl, and wget. | ||
195 | |||
196 | For example: | ||
197 | |||
198 | {{code language="none"}} | ||
199 | pip install my_package | ||
200 | {{/code}} | ||
201 | |||
202 | If you are using conda to manage Python packages: | ||
203 | |||
204 | {{code language="none"}} | ||
205 | conda install my_package | ||
206 | {{/code}} | ||
207 | |||
208 | The above information is provided as examples only. Please refer to documentation provided by your IT department with respect to proxy configuration. | ||
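
If a proxy is required, pip accepts it through its {{code}}--proxy{{/code}} option. The sketch below only builds and prints the command so that it can be pasted into the JupyterHub terminal; the proxy address {{code}}http://proxy.example.org:8080{{/code}} and the helper name {{code}}pip_install_args{{/code}} are hypothetical placeholders:

```python
import shlex


def pip_install_args(package, proxy_url=None):
    """Build the argument list for installing a package with pip."""
    args = ["pip", "install"]
    if proxy_url:
        # pip expects the form scheme://[user:passwd@]proxy.server:port
        args += ["--proxy", proxy_url]
    args.append(package)
    return args


# Hypothetical proxy address; use the one provided by your IT department.
print(shlex.join(pip_install_args("my_package", "http://proxy.example.org:8080")))
```

Printing the command instead of executing it keeps the installation in the terminal of the activated virtual environment, as recommended above.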
209 | |||
210 | = Using the Pilot Command Line Interface in a JupyterHub Terminal = | ||
211 | |||
212 | The Pilot Command Line Interface (CLI) is deployed within JupyterHub as an extension resource. Project members can use the Pilot Command Line Interface in a JupyterHub terminal to download Project data from the Core for further analysis, and upload the derivative outputs back to the Green Room or Core. | ||
213 | |||
214 | The Home Directory is your default directory. When you download a copy of your Core files to JupyterHub, the files persist in the JupyterHub environment until you delete them, so you can return to the session and continue your work later without retrieving the data from the Core again. | ||
215 | |||
216 | The following sections focus on getting started with basic pilotcli commands in JupyterHub. For additional pilotcli commands and usage, see the article //Working with HDC Project Files in the Command Line Interface//. | ||
217 | |||
218 | == Launching Pilot Command Line Interface == | ||
219 | |||
220 | 1. Launch your Project and click the **JupyterHub** icon in the workspace icon group. | ||
221 | 1. Click the **Terminal** launcher icon to open the Terminal. | ||
222 | 1. In the JupyterHub Terminal, type {{code}}pilotcli{{/code}} to launch the latest version of the Pilot Command Line Interface. | ||
223 | 1. Use the command {{code}}pilotcli --help{{/code}} at any time to show the welcome message again. | ||
224 | |||
1.3 | 225 | {{code language="none"}} |
226 | collaborator4@jupyter-collaborator4:~$ pilotcli | ||
227 | Usage: pilotcli [OPTIONS] COMMAND [ARGS]... | ||
1.2 | 228 | |
1.3 | 229 | What's new (Version 2.2.0): |
230 | |||
231 | 1. CLI supports to perform multi-threading upload for file/folders | ||
232 | |||
233 | 2. CLI supports to perform resumable upload for single file | ||
234 | |||
235 | |||
236 | |||
237 | Options: | ||
238 | --help Show this message and exit. | ||
239 | |||
240 | Commands: | ||
241 | container_registry Container Registry Actions. | ||
242 | dataset Dataset Actions. | ||
243 | file File Actions. | ||
244 | project Project Actions. | ||
245 | use_config Config Actions. | ||
246 | user User Actions. | ||
247 | {{/code}} | ||
248 | |||
249 | == Logging into the Pilot Command Line Interface == | ||
250 | |||
251 | Users are required to log in with platform credentials before performing any tasks through the Pilot Command Line Interface. | ||
252 | |||
253 | * Use the command {{code}}pilotcli user login{{/code}} to log into the Pilot Command Line Interface. | ||
254 | |||
255 | {{code language="none"}} | ||
256 | collaborator4@jupyter-collaborator4:~$ pilotcli user login | ||
257 | Please, access https://iam.staging.pilot.indocresearch.com/realms/pilot/device?user_code=XXXX-XXXX to proceed | ||
258 | ▄▄▄▄▄▄▄ ▄ ▄▄ ▄ ▄▄▄▄ ▄ ▄▄▄▄▄▄▄ | ||
259 | █ ▄▄▄ █ ▄ ▄███ ▀▀ █▀ ▀██▄ █ ▄▄▄ █ | ||
260 | █ ▄ ▀ ▄ ▀▄ ▀▀ ▄█▀▄▀ ▀▀▄█▄▄▀ █████▄▄▀▄ | ||
261 | ▄▄▄▄▄▄▄ ▀ ▀█▄ ▀▄ ██▀█ ▄▀▄▄ █ ▄ █▀▄▄▄ | ||
262 | █ ▄▄▄ █ █▀█▄▀ █▀ █▀▀█ ▀▄█▄█▄▄▄█▀▄█ | ||
263 | █ ███ █ █▀██▀▄ █▀▄▄▀▀█▄▀▀█▄▀█ ▀ ▀▄▀██ | ||
264 | █▄▄▄▄▄█ ▄▀▄▄██▄▄▀▄ ▀▀▄ ▄▄▀▀▀▄ █▄▄▄█ | ||
265 | |||
266 | Waiting validation finish... | ||
267 | {{/code}} | ||
268 | |||
269 | * ((( | ||
270 | You’ll be asked to validate your HDC user account using one of the provided methods. | ||
271 | |||
272 | * Copy and paste the provided validation link into a new browser tab or | ||
273 | * Scan the QR code with your mobile device. | ||
274 | ))) | ||
275 | * Open the login window and enter your HDC username and password (i.e. your EBRAINS account credentials). | ||
276 | * Grant access by clicking **Yes**. | ||
277 | |||
278 | [[image:Pilotcli Jupyter user login Grant Access window v2.4.0 2023-05-25.png||height="46%" width="50%"]] | ||
279 | |||
280 | [[image:Pilotcli Jupyter user login Device Login Successful v2.4.0 2023-05-25.png||height="16%" width="50%"]] | ||
281 | |||
282 | * After successful confirmation, return to the terminal in your JupyterHub browser tab. | ||
283 | |||
284 | {{code language="none"}} | ||
285 | Welcome to the Command Line Tool! | ||
286 | {{/code}} | ||
287 | |||
288 | * You’re now ready to start using the Pilot Command Line Interface to work with your Project data in JupyterHub. | ||
289 | |||
290 | == Zone Restrictions when using Pilot Command Line Interface in JupyterHub == | ||
291 | |||
292 | When using the Pilot Command Line Interface in JupyterHub, the following file operations are permitted: | ||
293 | |||
294 | |=(% colspan="1" rowspan="1" %)((( | ||
295 | **File Operation** | ||
296 | )))|=(% colspan="1" rowspan="1" %)((( | ||
297 | **Permitted in the** | ||
1.4 | 298 | **Green Room** |
1.3 | 299 | )))|=(% colspan="1" rowspan="1" %)((( |
300 | **Permitted in the** | ||
1.4 | 301 | **Core** |
1.3 | 302 | ))) |
303 | |(% colspan="1" rowspan="1" %)File upload | ||
304 | (upload derivative output files from JupyterHub to the Green Room or Core storage)|(% colspan="1" rowspan="1" %)((( | ||
305 | Yes | ||
306 | )))|(% colspan="1" rowspan="1" %)((( | ||
307 | Yes | ||
308 | ))) | ||
309 | |(% colspan="1" rowspan="1" %)File download | ||
310 | (download files from Green Room or Core into JupyterHub)|(% colspan="1" rowspan="1" %)((( | ||
311 | **No** | ||
312 | )))|(% colspan="1" rowspan="1" %)((( | ||
313 | Yes | ||
314 | ))) | ||
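
For scripts that prepare transfers, the table above can be encoded as a small lookup. This is only an illustrative sketch: the keys {{code}}"upload"{{/code}}, {{code}}"download"{{/code}}, {{code}}"greenroom"{{/code}}, and {{code}}"core"{{/code}} are labels chosen for this example, not pilotcli arguments, and the platform itself enforces the actual restrictions:

```python
# Permissions copied from the table above: operation -> zone -> allowed.
ZONE_PERMISSIONS = {
    "upload": {"greenroom": True, "core": True},
    "download": {"greenroom": False, "core": True},
}


def is_permitted(operation, zone):
    """Return True if the file operation is allowed in the given zone."""
    try:
        return ZONE_PERMISSIONS[operation][zone]
    except KeyError:
        # Fail loudly on labels that are not in the table.
        raise ValueError(
            f"unknown operation or zone: {operation!r}, {zone!r}"
        ) from None


# Downloading from the Green Room into JupyterHub is the one blocked case.
print(is_permitted("download", "greenroom"))
```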
315 | |||
316 | |||
1.5 | 317 | |
318 | |||
319 |