Wiki source code of Using JupyterHub in HDC

Last modified by Dennis Segebarth on 2026/02/26 12:16

1 {{box cssClass="floatinginfobox" title="Table of Contents"}}
2 {{toc depth="2"/}}
3 {{/box}}
4
5 {{warning}}
6 **Disclaimer:** This article may contain screenshots from other Pilot-based deployments. Any differences are limited to branding and color schemes and do not affect the functionality covered in this article.
7 {{/warning}}
8
9
10 JupyterHub is an open-source, multi-user version of Jupyter Notebook for performing analysis of Project files in the Core. More information can be found in the application documentation [[https:~~/~~/jupyter.org/>>https://jupyter.org/]].
11
12 [[image:1723798240534-407.png||height="189" width="291"]]
13
14
15 = How it Works =
16
17 JupyterHub allows Project members to create or import Jupyter Notebooks into the Project Workspace environment, retrieve Project files from the Core, perform computational workflows on the data, and write the outputs back to the Core where they can be accessed by other Project members. JupyterHub spins up a new JupyterLab instance for each Project member.
18
19 = Prerequisites =
20
21 * Project Collaborator role or higher.
22 * JupyterHub has been configured for the Project by the Platform Administrator. See //Getting Access to JupyterHub//.
23
24 = Data Stewardship =
25
26 Users are reminded to abide by the Platform Terms of Use and any Project-specific restrictions when using Workspace tools to access data and code.
27
28 = Getting Access to JupyterHub =
29
30 JupyterHub is configured at the time of Project Setup. If you launch JupyterHub and receive a notice that it hasn’t been deployed for your project, please contact your Platform Administrator.
31
32 {{info}}
33 If you access JupyterHub in the HealthDataCloud Test Project, please be aware that each user's resources are limited to 2 GB of persistent storage, 4 GB of memory, and a single CPU. These limits can be adjusted for new Projects.
34 {{/info}}
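
To see which resources your own session reports, a few standard Linux commands can be run in the JupyterHub terminal. This is only a sketch: it assumes a Linux environment with GNU coreutils, and depending on how the limits are enforced, the values reported inside a container may reflect the host machine rather than the per-user limits listed above.

```shell
# Number of CPU cores available to this session
cpu_count=$(nproc)
echo "CPUs: ${cpu_count}"

# Total memory as reported by the kernel (may show the host total inside a container)
mem_total=$(grep MemTotal /proc/meminfo)
echo "${mem_total}"

# Size and free space of the volume that holds the home directory
df -h "$HOME"
```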
35
36 = Launching JupyterHub =
37
38 [[image:1723798257792-201.png||height="121" width="349"]]
39
40 1. Launch your Project and click the **JupyterHub icon** in the left menu bar.
41 1. Click **Sign in with Keycloak** to initiate your session. JupyterHub automatically authenticates with your existing username and password and launches your session - no additional sign-in is required.
42 1. You can choose to start either a **Minimal environment**, which comes with Python, or a **Datascience environment**, which includes R and Julia in addition to Python.
43 1. From the JupyterHub home page (a JupyterLab interface), you can now create and work on Jupyter Notebooks, import existing ones, and use the Pilot Command Line Interface in the terminal to retrieve, analyze, and re-upload Project Core data. You can also use the pre-deployed and configured package manager conda to download, install, and manage packages, for instance Python packages, as needed (see the sections //Installing New Python Packages// and //Creating a Python Virtual Environment and Registering a Kernel// below for more details).
44 1. When finished using JupyterHub, click **Logout** to end your session.
45
46 = Creating a Notebook =
47
48 Users can create a new Jupyter Notebook with Python 3 inside JupyterHub, with dedicated and persistent storage under the user's Home Directory.
49
50 1. In the Launcher, click the **Python 3 Notebook** icon, or click **File > New > Notebook**.
51 1. Create your Notebook.
52
53 [[image:1723798278604-114.png||height="376" width="865"]]
54
55 = Launching the Terminal =
56
57 JupyterHub provides browser-based terminal access for advanced users to run commands directly in the system shell. Importantly, this allows users to sync data between, for instance, the Project's Core and their JupyterHub home directory using pilotcli, or to download and manage Python packages.
58
59 1. In the Launcher, click the **Terminal** icon, or click **File > New > Terminal**.
60 1. The terminal window opens.
61
62 [[image:1723798293872-992.png||height="162" width="863"]]
63
64 Ubuntu is used to host Jupyter Notebook. Use the command {{code}}cat /etc/os-release{{/code}} to determine the current version of Ubuntu:
65
66 {{code language="none"}}
67 uname@jupyter-uname:/etc$ cat os-release
68 NAME="Ubuntu"
69 VERSION="20.04.4 LTS (Focal Fossa)"
70 ID=ubuntu
71 ID_LIKE=debian
72 PRETTY_NAME="Ubuntu 20.04.4 LTS"
73 VERSION_ID="20.04"
74 HOME_URL="https://www.ubuntu.com/"
75 SUPPORT_URL="https://help.ubuntu.com/"
76 BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
77 PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
78 VERSION_CODENAME=focal
79 UBUNTU_CODENAME=focal
80 {{/code}}
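
If only a single field is needed, for example in a script, the file can also be loaded as a shell file, since it consists of plain KEY=value assignments. A minimal sketch, assuming a bash or other POSIX shell; note that it defines the fields as variables in the current shell session:

```shell
# Load the os-release fields (NAME, ID, PRETTY_NAME, ...) as shell variables
. /etc/os-release

# Print only the fields of interest
echo "Running on ${PRETTY_NAME} (ID: ${ID})"
```

With the release shown above, this would print "Running on Ubuntu 20.04.4 LTS (ID: ubuntu)".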
81
82 = Creating a Python Virtual Environment and Registering a Kernel =
83
84 Users are free to use different virtual environment and/or package management systems. Examples using conda or Python's built-in venv module are described below. Importantly, in either case, you have to register the new environment as a kernel using ipykernel to make it accessible from Jupyter Notebooks (see //Registering the new Virtual Environment as Kernel// for more details).
85
86 == Using conda ==
87
88 The package manager conda by Anaconda has become one of the most popular package management systems, especially in the data and life sciences. Therefore, conda is pre-deployed and configured in each user’s JupyterHub. Please find the conda documentation on managing virtual environments [[here>>url:https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html]]. The following steps provide a short example of how you can use conda to create a new virtual environment using the JupyterHub terminal within the Platform.
89
90 First, you need to activate conda. Since it is pre-deployed and configured for you, all you need to do is launch a terminal within JupyterHub (see //Launching the Terminal// above) and execute the command {{code}}source activate{{/code}}. Once conda is active, the name of the currently activated conda environment, usually “base”, is displayed in parentheses at the beginning of the prompt:
91
92 {{code language="none"}}
93 username@jupyter-username:~$ source activate
94 (base) username@jupyter-username:~$
95 {{/code}}
96
97 To create a new environment, run the following command in the terminal after activating conda:
98
99 {{code language="none"}}
100 (base) username@jupyter-username:~$ conda create --name your_env_name
101 {{/code}}
102
103 Replace {{code}}your_env_name{{/code}} with your preferred name for the environment. When conda prompts you to confirm the creation of the environment at the specified location (by default in the user's home directory; please do not change this location, so that your environment persists), proceed by typing “y” or abort by typing “n”. Once confirmed, conda completes the environment creation and reminds you to activate the environment:
104
105 {{code language="none"}}
106 (base) username@jupyter-username:~$ conda create --name sample_env
107 Collecting package metadata (current_repodata.json): done
108 Solving environment: done
109
110 ## Package Plan ##
111
112 environment location: /home/username/.conda_envs/sample_env
113
114 Proceed ([y]/n)? y
115
116 Preparing transaction: done
117 Verifying transaction: done
118 Executing transaction: done
119 #
120 # To activate this environment, use
121 #
122 # $ conda activate sample_env
123 #
124 # To deactivate an active environment, use
125 #
126 # $ conda deactivate
127
128 (base) username@jupyter-username:~$
129 {{/code}}
130
131 Please note that after the environment has been created, the previously active environment (“base” in this example) remains active. Therefore, remember to activate the new environment before installing any packages by running the command {{code}}conda activate your_env_name{{/code}}, replacing “your_env_name” with the name you chose (“sample_env” in this example):
132
133 {{code language="none"}}
134 (base) username@jupyter-username:~$ conda activate sample_env
135 (/home/username/.conda_envs/sample_env) username@jupyter-username:~$
136 {{/code}}
137
138 You can now install the desired packages in this new conda environment, for instance using the {{code}}conda install{{/code}} command. For example, in order to install the latest version of Python, run:
139
140 {{code language="none"}}
141 (/home/username/.conda_envs/sample_env) username@jupyter-username:~$ conda install python
142 {{/code}}
143
144 To see a list of all installed packages in the currently activated environment (indicated in parentheses at the beginning of the line, “base” in this case), run:
145
146 {{code language="none"}}
147 (base) username@jupyter-username:~$ conda list
148 {{/code}}
149
150 To see a list of all existing conda environments, run:
151
152 {{code language="none"}}
153 (base) username@jupyter-username:~$ conda info --envs
154 {{/code}}
155
156 Please find many more examples and the full documentation of how to manage conda environments [[here>>url:https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#]]. Importantly, please remember to follow the instructions in the //Registering the new Virtual Environment as Kernel// section below, to make the virtual environment accessible via the Jupyter Notebooks.
157
158 == Using venv ==
159
160 As an alternative to conda, you can also use Python's built-in venv module. Please find the full documentation of venv [[here>>url:https://docs.python.org/3/library/venv.html#]], and a short example of how to create a new virtual environment using venv below:
161
162 {{code language="none"}}
163 username@jupyter-username:~$ python3 -m venv your_env_name
164 username@jupyter-username:~$ source your_env_name/bin/activate
165 {{/code}}
166
167 == Registering the new Virtual Environment as Kernel ==
168
169 To make the newly created virtual environment accessible from Jupyter Notebooks, you have to register it using ipykernel. Importantly, please make sure that the corresponding environment is active before running the following command:
170
171 {{code language="none"}}
172 username@jupyter-username:~$ python -m ipykernel install --user --name=your_env_name
173 {{/code}}
174
175 Please replace {{code}}your_env_name{{/code}} with the name of your newly created environment. Depending on which package and/or virtual environment management system you chose to use, you may have to install ipykernel in the newly created environment first. Remember to activate the newly created environment and then run one of the following commands to install ipykernel, depending on your package management system of choice:
176
177 {{code language="none"}}
178 (your_env_name) username@jupyter-username:~$ conda install -c anaconda ipykernel
179 {{/code}}
180
181 or:
182
183 {{code language="none"}}
184 (your_env_name) username@jupyter-username:~$ pip install ipykernel
185 {{/code}}
186
187 Once you have installed ipykernel, re-run the command above to register your new environment via ipykernel.
188
189 **Example usage:**
190
191 {{code language="none"}}
192 (/home/username/.conda_envs/sample_env) username@jupyter-username:~$ python -m ipykernel install --user --name=sample_env
193 Installed kernelspec sample_env in /home/username/.local/share/jupyter/kernels/sample_env
194 (/home/username/.conda_envs/sample_env) username@jupyter-username:~$
195 {{/code}}
196
197 Afterwards, the environment will be listed when you open the Launcher to open a new Jupyter Notebook:
198
199 [[image:1723798325144-485.png||height="436" width="867"]]
200
201
202 The environment can also be selected from within any open Notebook, e.g., via **Kernel > Change Kernel…**:
203
204 [[image:1723798338447-469.png||height="317" width="247"]]
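
Registered kernels can also be inspected from the terminal. The commands below belong to the standard Jupyter command line tools rather than to anything HDC-specific, and the sketch assumes that the jupyter command is on the PATH of the active environment. The kernel name sample_env is the example name used above.

```shell
# List all kernels that Jupyter currently knows about, if the jupyter command is available
if command -v jupyter >/dev/null 2>&1; then
    jupyter kernelspec list
    kernel_check="listed"
else
    echo "jupyter command not found in this environment"
    kernel_check="jupyter-missing"
fi

# To remove a kernel that is no longer needed (left commented out here):
# jupyter kernelspec uninstall sample_env
```

Removing a kernel only removes its entry from the kernel list; the underlying conda or venv environment is left untouched.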
205
206 = Installing New Python Packages =
207
208 We highly recommend the use of virtual environments when installing new packages (see //Creating a Python Virtual Environment and Registering a Kernel// above for more details). Consequently, we recommend installing new packages via commands in the JupyterHub terminal in the corresponding virtual environments, instead of installing packages from within Jupyter Notebooks.
209
210 Depending on your organization's IT policies, outbound traffic may need to go through a proxy. If so, you will need to pass the proxy settings as a command-line argument to tools such as pip, curl, or wget.
211
212 For example:
213
214 {{code language="none"}}
215 pip install my_package
216 {{/code}}
217
218 If you are using conda to manage Python packages:
219
220 {{code language="none"}}
221 conda install my_package
222 {{/code}}
223
224 The above information is provided as examples only. Please refer to documentation provided by your IT department with respect to proxy configuration.
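
As a rough sketch of what such a proxy configuration could look like: pip accepts a --proxy option, and pip, conda, curl, and wget generally honor the standard proxy environment variables. The proxy address below is a placeholder and not a real HDC endpoint; use the value provided by your IT department.

```shell
# Placeholder address (assumption): replace with the proxy given by your IT department
PROXY_URL="http://proxy.example.org:8080"

# Set both spellings, since tools differ in which one they read
export http_proxy="$PROXY_URL" https_proxy="$PROXY_URL"
export HTTP_PROXY="$PROXY_URL" HTTPS_PROXY="$PROXY_URL"

echo "Proxy for this terminal session: $https_proxy"

# Alternatively, pass the proxy to a single pip call (left commented out here):
# pip install --proxy "$PROXY_URL" my_package
```

The exported variables only apply to the current terminal session and have to be set again in a new terminal.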
225
226 = Using the Pilot Command Line Interface in a JupyterHub Terminal =
227
228 The Pilot Command Line Interface (CLI) is deployed within JupyterHub as an extension resource. Project members can use the Pilot Command Line Interface in a JupyterHub terminal to download Project data from the Core for further analysis, and upload the derivative outputs back to the Green Room or Core.
229
230 The Home Directory is your default directory. When you download a copy of your Core files to JupyterHub, the files persist in the JupyterHub environment until you delete them, so you can return to the session and continue your work later without retrieving the data from the Core again.
231
232 The following sections focus on getting started with basic pilotcli commands in JupyterHub. For additional pilotcli commands and usage, see the article //Working with HDC Project Files in the Command Line Interface//.
233
234 == Launching Pilot Command Line Interface ==
235
236 1. Launch your Project and click the **JupyterHub** icon in the workspace icon group.
237 1. Click the **Terminal** launcher icon to open the Terminal.
238 1. In the JupyterHub Terminal, type {{code}}pilotcli{{/code}} to launch the latest version of the Pilot Command Line Interface.
239 1. Use {{code}}pilotcli --help{{/code}} at any time to show the welcome message again.
240
241 {{code language="none"}}
242 collaborator4@jupyter-collaborator4:~$ pilotcli
243 Usage: pilotcli [OPTIONS] COMMAND [ARGS]...
244
245 What's new (Version 2.2.0):
246
247 1. CLI supports to perform multi-threading upload for file/folders
248
249 2. CLI supports to perform resumable upload for single file
250
251
252
253 Options:
254 --help Show this message and exit.
255
256 Commands:
257 container_registry Container Registry Actions.
258 dataset Dataset Actions.
259 file File Actions.
260 project Project Actions.
261 use_config Config Actions.
262 user User Actions.
263 {{/code}}
264
265 == Logging into the Pilot Command Line Interface ==
266
267 Users are required to log in with their platform credentials before performing any tasks through the Pilot Command Line Interface.
268
269 * Use the command {{code}}pilotcli user login{{/code}} to log into the Pilot Command Line Interface.
270
271 {{code language="none"}}
272 collaborator4@jupyter-collaborator4:~$ pilotcli user login
273 Please, access https://iam.staging.pilot.indocresearch.com/realms/pilot/device?user_code=XXXX-XXXX to proceed
274 ▄▄▄▄▄▄▄ ▄ ▄▄ ▄ ▄▄▄▄ ▄ ▄▄▄▄▄▄▄
275 █ ▄▄▄ █ ▄ ▄███ ▀▀ █▀ ▀██▄ █ ▄▄▄ █
276 █ ▄ ▀ ▄ ▀▄ ▀▀ ▄█▀▄▀ ▀▀▄█▄▄▀ █████▄▄▀▄
277 ▄▄▄▄▄▄▄ ▀ ▀█▄ ▀▄ ██▀█ ▄▀▄▄ █ ▄ █▀▄▄▄
278 █ ▄▄▄ █ █▀█▄▀ █▀ █▀▀█ ▀▄█▄█▄▄▄█▀▄█
279 █ ███ █ █▀██▀▄ █▀▄▄▀▀█▄▀▀█▄▀█ ▀ ▀▄▀██
280 █▄▄▄▄▄█ ▄▀▄▄██▄▄▀▄ ▀▀▄ ▄▄▀▀▀▄ █▄▄▄█
281
282 Waiting validation finish...
283 {{/code}}
284
285 * (((
286 You’ll be asked to validate your HDC user account using one of the provided methods.
287
288 * Copy and paste the provided validation link into a new browser tab or
289 * Scan the QR code with your mobile device.
290 )))
291 * Open the login window and enter your HDC username and password (i.e. your EBRAINS account credentials).
292 * Grant access by clicking **Yes**.
293
294 [[image:1723798355215-434.png||height="352" width="379"]]
295
296 [[image:1723798365454-527.png||height="123" width="376"]]
297
298 * After successful confirmation, return to the terminal in your JupyterHub browser tab.
299
300 {{code language="none"}}
301 Welcome to the Command Line Tool!
302 {{/code}}
303
304 * You’re now ready to start using the Pilot Command Line Interface to work with your Project data in JupyterHub.
305
306 == Zone Restrictions when using Pilot Command Line Interface in JupyterHub ==
307
308 When using the Pilot Command Line Interface in JupyterHub, the following file operations are permitted in each zone:
309
310 |=(% colspan="1" rowspan="1" %)(((
311 **File Operation**
312 )))|=(% colspan="1" rowspan="1" %)(((
313 **Permitted in the **
314 **Green Room**
315 )))|=(% colspan="1" rowspan="1" %)(((
316 **Permitted in the **
317 **Core**
318 )))
319 |(% colspan="1" rowspan="1" %)File upload 
320 (upload derivative output files from JupyterHub to the Green Room or Core storage)|(% colspan="1" rowspan="1" %)(((
321 Yes
322 )))|(% colspan="1" rowspan="1" %)(((
323 Yes
324 )))
325 |(% colspan="1" rowspan="1" %)File download
326 (download files from Green Room or Core into JupyterHub)|(% colspan="1" rowspan="1" %)(((
327 **No**
328 )))|(% colspan="1" rowspan="1" %)(((
329 Yes
330 )))
331
332 == Downloading Project Data to JupyterHub using the Pilot Command Line Interface ==
333
334 After logging into the Pilot Command Line Interface, you can download data from the Project Core into the JupyterHub environment to start your data analyses.
335
336 File-related commands are grouped in the {{code}}file{{/code}} category. To view the full list of commands in this category, type {{code}}pilotcli file --help{{/code}}. To download Project data, use the {{code}}file sync{{/code}} command. To view the options of this command, type {{code}}pilotcli file sync --help{{/code}}.
337
338
339 {{code language="none"}}
340 collaborator4@jupyter-collaborator4:~$ pilotcli file sync --help
341 Usage: pilotcli file sync [OPTIONS] [PATHS]... OUTPUT_PATH
342
343 Download files/folders from a given Project/folder/file in core zone.
344
345 Options:
346 -z, --zone TEXT Target Zone (i.e., core/greenroom)
347 --zip Download files as a zip.
348 -i, --geid Enable downloading by geid.
349 --help Show this message and exit.
350 {{/code}}
351
352 === Example ===
353
354 Downloading a file from the Core to your Home Directory:
355
356 Reminder: Please follow Linux conventions for file management. If your filename contains spaces, wrap it in single or double quotes.
357
358 * //Filename~:// “Chemical Tracking Data.csv”
359 * //Source~:// Project “Indoc Test Project”, “Core” storage zone, folder “collaborator4” {{code}}indoctestproject/collaborator4/Chemical Tracking Data.csv -z core{{/code}}
360 * //Destination~:// user's Home directory in the Guacamole or JupyterHub VM {{code}}.{{/code}}
361 * //Command group/option~:// {{code}}file sync{{/code}}
362
363 {{code language="none"}}
364 collaborator4@jupyter-collaborator4:~$ pilotcli file sync indoctestproject/collaborator4/'Chemical Tracking Data.csv' . -z core
365 start downloading...
366 Downloading Chemical Tracking Data.csv |██████████████████████████████ 100% 00:00
367 File has been downloaded successfully and saved to: ./Chemical Tracking Data.csv
368 {{/code}}
369
370 To confirm successful download, type {{code}}ls{{/code}} and verify the file "Chemical Tracking Data.csv" is stored in the Home folder.
371
372 {{code language="none"}}
373 collaborator4@jupyter-collaborator4:~$ ls
374 'Chemical Tracking Data.csv' pilotcli
375 {{/code}}
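
The quoting rule from the reminder above can be tried out locally without touching any Project data. The sketch below works in a temporary directory and uses made-up file names; only the quoting behavior is the point.

```shell
# Work in a throwaway directory so nothing in the home directory is touched
workdir=$(mktemp -d)
cd "$workdir"

# Quoted: the shell passes one argument, so exactly one file is created
touch 'Example Data File.csv'

# Unquoted: the shell splits on spaces and creates three separate files
touch Other Data File.csv

# Show the result, one entry per line, and count the entries
ls -1
file_count=$(ls -1 | wc -l)
echo "Files created: $file_count"
```

The listing contains four entries: the single quoted file plus the three fragments of the unquoted name. The same splitting would happen to an unquoted path passed to pilotcli. The temporary directory can be removed afterwards with rm -r "$workdir".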
376
377 The file “Chemical Tracking Data.csv” can be viewed in the JupyterHub graphical user interface:
378
379 [[image:1723798383409-873.png||height="267" width="874"]]
380
381
382 == Uploading Project Data from JupyterHub using the Pilot Command Line Interface ==
383
384 After analyzing Project data inside the JupyterHub, you can upload the generated outputs back into the Project via the Pilot Command Line Interface.
385
386 === Example ===
387
388 * //Filename~:// Chemical Tracking Data rev.csv
389 * //Source~:// user's Home directory in JupyterHub {{code}}.{{/code}}
390 * //Destination~:// Project “Indoc Test Project”, folder “collaborator4”, “Core” storage zone
391 {{code}}indoctestproject/collaborator4 -z core{{/code}}
392 * //Command group/option~:// {{code}}file upload{{/code}}
393 * //User message// (for upload back to the Core): “my workbench output, no additional sensitive data”
394 * //Command~:// {{code}}pilotcli file upload ./'Chemical Tracking Data rev.csv' -p indoctestproject/collaborator4 -z core -m "my workbench output, no additional sensitive data"{{/code}}
395
396 When uploading data to the Core, you are reminded that you are bypassing the usual Green Room upload workflow. To confirm, type {{code}}y{{/code}} at the prompt, or {{code}}N{{/code}} to cancel.
397
398 {{code language="none"}}
399 collaborator4@jupyter-collaborator4:~$ pilotcli file upload ./'Chemical Tracking Data rev.csv' -p indoctestproject/collaborator4 -z core -m "my workbench output, no additional sensitive data"
400 You are about to transfer data directly to the PILOT Core! In accordance with the PILOT Terms of Use, please confirm that you have made your best efforts to
401 pseudonymize or anonymize the data and that you have the legal authority to transfer and make this data available for dissemination and use within the PILOT .If you
402 need to process the data to remove sensitive identifiers, please cancel this transfer and upload the data to the Green Room to perform these actions.
403 To cancel this transfer, enter [n/No]
404 To confirm and proceed with the data transfer, enter [y/Yes]
405 [y/N]: y
406 Starting upload of: ./Chemical Tracking Data rev.csv
407 Pre-upload complete.
408 Uploading Chemical Tracking Data rev.csv: |██████████████████████████████ 100% 00:00
409 Upload Time: 2.92s for 1 files
410 All uploading jobs have finished.
411 {{/code}}
412
413 After completing the upload, you can confirm that the new file “Chemical Tracking Data rev.csv” exists in the correct directory using the {{code}}pilotcli file list{{/code}} command and/or the Portal File Explorer.
414
415 {{code language="none"}}
416 collaborator4@jupyter-collaborator4:~$ pilotcli file list indoctestproject/collaborator4 -z core
417 Chemical Tracking Data rev.csv Chemical Tracking Data.csv
418 {{/code}}
419
420 [[image:1723798397694-530.png||height="217" width="863"]]
421
422 ----
423
424 Copyright © 2023-2026 [[Indoc Systems>>url:https://www.indocsystems.com]].
425
426 HealthDataCloud is powered by Pilot technology, a product of [[Indoc Systems>>url:https://www.indocsystems.com]].