Changes for page Using JupyterHub in HDC

Last modified by Dennis Segebarth on 2024/10/02 18:14

From version 1.1
edited by Susan Evans
on 2023/07/11 13:35
Change comment: (Autosaved)
To version 19.1
edited by Dennis Segebarth
on 2024/10/02 18:14
Change comment: There is no comment for this version

Summary

Details

Page properties
Author
... ... @@ -1,1 +1,1 @@
1 -XWiki.sgevans
1 +XWiki.dsegebarth
Content
... ... @@ -1,5 +1,13 @@
1 +{{box cssClass="floatinginfobox" title="Table of Contents"}}
2 +{{toc depth="2"/}}
3 +{{/box}}
4 +
5 +
1 1  JupyterHub is an open-source, multi-user version of Jupyter Notebook for performing analysis of Project files in the Core. More information can be found in the application documentation [[https:~~/~~/jupyter.org/>>https://jupyter.org/]].
2 2  
8 +[[image:1723798240534-407.png||height="189" width="291"]]
9 +
10 +
3 3  = How it Works =
4 4  
5 5  JupyterHub allows Project members to create or import Jupyter Notebooks into the Project Workspace environment, retrieve Project files from the Core, perform computational workflows on the data, and write the outputs back to the Core where they can be accessed by other Project members. JupyterHub spins up a new JupyterLab instance for each Project member.
... ... @@ -17,9 +17,13 @@
17 17  
18 18  JupyterHub is configured at the time of Project Setup. If you launch JupyterHub and receive a notice that it hasn’t been deployed for your project, please contact your Platform Administrator.
19 19  
28 +{{info}}
29 +If you access JupyterHub in the HealthDataCloud Test Project, please be aware that each user's resources are limited to 2 GB of persistent storage, 4 GB of memory, and a single CPU. These limits can easily be adjusted for new Projects.
30 +{{/info}}
31 +
20 20  = Launching JupyterHub =
21 21  
22 -[[image:HDC Project Workspace tool navigation Jupyterhub v1.0.0 2023-05-25.png||height="10%" width="30%"]]
34 +[[image:1723798257792-201.png||height="121" width="349"]]
23 23  
24 24  1. Launch your Project and click the **JupyterHub icon** in the left menu bar.
25 25  1. Click **Sign in with Keycloak** to initiate your session. JupyterHub automatically authenticates with your existing username and password and launches your session - no additional sign-in is required.
... ... @@ -34,7 +34,7 @@
34 34  1. In the Launcher, click the **Python 3 Notebook **icon, or click **File > New > Notebook**.
35 35  1. Create your Notebook.
36 36  
37 -[[image:Project Workspace Jupyter Create Python Notebook v2.1.6 2023-02-07.png||height="22%" width="50%"]]
49 +[[image:1723798278604-114.png||height="376" width="865"]]
38 38  
39 39  = Launching the Terminal =
40 40  
... ... @@ -43,7 +43,7 @@
43 43  1. In the Launcher, click the **Terminal **icon, or click **File > New > Terminal**.
44 44  1. The terminal window opens.
45 45  
46 -[[image:Project Workspace Jupyter Launch Terminal v2.1.6 2023-02-07.png||height="9%" width="50%"]]
58 +[[image:1723798293872-992.png||height="162" width="863"]]
47 47  
48 48  Ubuntu is used to host Jupyter Notebook. Use the command cat /etc/os-release to determine the current version of Ubuntu:
49 49  
... ... @@ -180,8 +180,231 @@
180 180  
181 181  Afterwards, the environment will be listed when you open the Launcher to open a new Jupyter Notebook:
182 182  
195 +[[image:1723798325144-485.png||height="436" width="867"]]
183 183  
184 184  
198 +and also from each opened Notebook, e.g., via **Kernel > Change Kernel…** :
185 185  
200 +[[image:1723798338447-469.png||height="317" width="247"]]
186 186  
202 += Installing New Python Packages =
203 +
204 +We highly recommend the use of virtual environments when installing new packages (see //Creating a Python Virtual Environment and Registering a Kernel// above for more details). Consequently, we recommend installing new packages via commands in the JupyterHub terminal in the corresponding virtual environments, instead of installing packages from within Jupyter Notebooks.
205 +
206 +Depending on your organization's IT policies, outbound traffic may need to go through a proxy. If so, you will need to pass the proxy setting as a command-line argument to tools such as pip, curl, and wget.
207 +
208 +For example:
209 +
210 +{{code language="none"}}
211 +pip install my_package
212 +{{/code}}
213 +
214 +If you are using conda to manage Python packages:
215 +
216 +{{code language="none"}}
217 +conda install my_package
218 +{{/code}}
219 +
220 +The above information is provided as examples only. Please refer to documentation provided by your IT department with respect to proxy configuration.
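
As a rough sketch of what passing a proxy can look like, the snippet below sets the conventional proxy environment variables and wraps pip and curl with their proxy options. The proxy address {{code}}http://proxy.example.org:8080{{/code}} and the helper names {{code}}pip_install_via_proxy{{/code}} and {{code}}curl_via_proxy{{/code}} are placeholders invented for illustration, not part of HDC; use the values provided by your IT department.

```shell
# Placeholder proxy address (an assumption); replace it with the value
# provided by your IT department.
PROXY_URL="${PROXY_URL:-http://proxy.example.org:8080}"

# Many command-line tools (pip, curl, wget, conda) typically read these variables.
export http_proxy="$PROXY_URL"
export https_proxy="$PROXY_URL"

# Hypothetical helper: install packages using pip's --proxy option.
pip_install_via_proxy() {
  pip install --proxy "$PROXY_URL" "$@"
}

# Hypothetical helper: fetch a URL using curl's --proxy option.
curl_via_proxy() {
  curl --proxy "$PROXY_URL" "$@"
}
```

With the helpers defined, {{code}}pip_install_via_proxy my_package{{/code}} would run the pip example above through the proxy. Run it inside the activated virtual environment so the package is installed there.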
221 +
222 += Using the Pilot Command Line Interface in a JupyterHub Terminal =
223 +
224 +The Pilot Command Line Interface (CLI) is deployed within JupyterHub as an extension resource. Project members can use the Pilot Command Line Interface in a JupyterHub terminal to download Project data from the Core for further analysis, and upload the derivative outputs back to the Green Room or Core.
225 +
226 +The Home Directory is your default directory. When you download a copy of your Core files to JupyterHub, the files persist in the JupyterHub environment until you delete them, so you can return to the session and continue your work at a later time without retrieving the data from the Core again.
227 +
228 +The following sections focus on getting started with basic pilotcli commands in JupyterHub. For additional pilotcli commands and usage, see the article //Working with HDC Project Files in the Command Line Interface//.
229 +
230 +== Launching Pilot Command Line Interface ==
231 +
232 +1. Launch your Project and click the **JupyterHub** icon in the workspace icon group.
233 +1. Click the **Terminal **launcher icon to open the Terminal.
234 +1. In the JupyterHub terminal, type {{code}}pilotcli{{/code}} to launch the latest version of the Pilot Command Line Interface.
235 +1. Use {{code}}pilotcli --help{{/code}} at any time to show the welcome message again.
236 +
237 +{{code language="none"}}
238 +collaborator4@jupyter-collaborator4:~$ pilotcli
239 +Usage: pilotcli [OPTIONS] COMMAND [ARGS]...
240 +
241 + What's new (Version 2.2.0):
242 +
243 + 1. CLI supports to perform multi-threading upload for file/folders
244 +
245 + 2. CLI supports to perform resumable upload for single file
246 +
247 +
248 +
249 +Options:
250 + --help Show this message and exit.
251 +
252 +Commands:
253 + container_registry Container Registry Actions.
254 + dataset Dataset Actions.
255 + file File Actions.
256 + project Project Actions.
257 + use_config Config Actions.
258 + user User Actions.
259 +{{/code}}
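
If the {{code}}pilotcli{{/code}} command does not respond as shown, a quick check like the following can tell you whether it is on your PATH at all. This is only a sketch; the helper name {{code}}have_command{{/code}} is made up for illustration.

```shell
# Hypothetical helper: succeed if the named command is available on PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

if have_command pilotcli; then
  echo "pilotcli found at: $(command -v pilotcli)"
else
  echo "pilotcli not found on PATH"
fi
```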
260 +
261 +== Logging into the Pilot Command Line Interface ==
262 +
263 +Users are required to log in with platform credentials before performing any tasks through the Pilot Command Line Interface.
264 +
265 +* Use the command {{code}}pilotcli user login{{/code}} to log into the Pilot Command Line Interface.
266 +
267 +{{code language="none"}}
268 +collaborator4@jupyter-collaborator4:~$ pilotcli user login
269 +Please, access https://iam.staging.pilot.indocresearch.com/realms/pilot/device?user_code=XXXX-XXXX to proceed
270 + ▄▄▄▄▄▄▄ ▄ ▄▄ ▄ ▄▄▄▄ ▄ ▄▄▄▄▄▄▄
271 + █ ▄▄▄ █ ▄ ▄███ ▀▀ █▀ ▀██▄ █ ▄▄▄ █
272 + █ ▄ ▀ ▄ ▀▄ ▀▀ ▄█▀▄▀ ▀▀▄█▄▄▀ █████▄▄▀▄
273 + ▄▄▄▄▄▄▄ ▀ ▀█▄ ▀▄ ██▀█ ▄▀▄▄ █ ▄ █▀▄▄▄
274 + █ ▄▄▄ █ █▀█▄▀ █▀ █▀▀█ ▀▄█▄█▄▄▄█▀▄█
275 + █ ███ █ █▀██▀▄ █▀▄▄▀▀█▄▀▀█▄▀█ ▀ ▀▄▀██
276 + █▄▄▄▄▄█ ▄▀▄▄██▄▄▀▄ ▀▀▄ ▄▄▀▀▀▄ █▄▄▄█
187 187  
278 + Waiting validation finish...
279 +{{/code}}
280 +
281 +* (((
282 +You’ll be asked to validate your HDC user account using one of the provided methods.
283 +
284 +* Copy and paste the provided validation link into a new browser tab or
285 +* Scan the QR code with your mobile device.
286 +)))
287 +* Open the login window and enter your HDC username and password (i.e. your EBRAINS account credentials).
288 +* Grant access by clicking **Yes**.
289 +
290 +[[image:1723798355215-434.png||height="352" width="379"]]
291 +
292 +[[image:1723798365454-527.png||height="123" width="376"]]
293 +
294 +* After successful confirmation, return to the terminal in your JupyterHub browser tab.
295 +
296 +{{code language="none"}}
297 +Welcome to the Command Line Tool!
298 +{{/code}}
299 +
300 +* You’re now ready to start using the Pilot Command Line Interface to work with your Project data in JupyterHub.
301 +
302 +== Zone Restrictions when using Pilot Command Line Interface in JupyterHub ==
303 +
304 +When using the Pilot Command Line Interface in JupyterHub, the following actions are possible on the derivative files generated in JupyterHub:
305 +
306 +|=(% colspan="1" rowspan="1" %)(((
307 +**File Operation**
308 +)))|=(% colspan="1" rowspan="1" %)(((
309 +**Permitted in the **
310 +**Green Room**
311 +)))|=(% colspan="1" rowspan="1" %)(((
312 +**Permitted in the **
313 +**Core**
314 +)))
315 +|(% colspan="1" rowspan="1" %)File upload 
316 +(upload derivative output files from JupyterHub to the Green Room or Core storage)|(% colspan="1" rowspan="1" %)(((
317 +Yes
318 +)))|(% colspan="1" rowspan="1" %)(((
319 +Yes
320 +)))
321 +|(% colspan="1" rowspan="1" %)File download
322 +(download files from Green Room or Core into JupyterHub)|(% colspan="1" rowspan="1" %)(((
323 +**No**
324 +)))|(% colspan="1" rowspan="1" %)(((
325 +Yes
326 +)))
327 +
328 +== Downloading Project Data to JupyterHub using the Pilot Command Line Interface ==
329 +
330 +After logging into the Pilot Command Line Interface, you can download data from the Project Core into the JupyterHub environment to start your data analyses.
331 +
332 +File-related commands are grouped in the {{code}}file{{/code}} category. To view the full list of commands in this category, type {{code}}pilotcli file --help{{/code}}. To download project data, use the {{code}}file sync{{/code}} command. To view its usage and options, type {{code}}pilotcli file sync --help{{/code}}.
333 +
334 +
335 +{{code language="none"}}
336 +collaborator4@jupyter-collaborator4:~$ pilotcli file sync --help
337 +Usage: pilotcli file sync [OPTIONS] [PATHS]... OUTPUT_PATH
338 +
339 + Download files/folders from a given Project/folder/file in core zone.
340 +
341 +Options:
342 + -z, --zone TEXT Target Zone (i.e., core/greenroom)
343 + --zip Download files as a zip.
344 + -i, --geid Enable downloading by geid.
345 + --help Show this message and exit.
346 +{{/code}}
347 +
348 +=== Example ===
349 +
350 +Downloading a file from the Core to your Home Directory:
351 +
352 +Reminder: Please follow Linux conventions for file management. If your filename contains spaces, wrap it in single or double quotes.
353 +
354 +* //Filename~:// “Chemical Tracking Data.csv”
355 +* //Source~:// Project “Indoc Test Project”, “Core” storage zone, folder “collaborator4” {{code}}indoctestproject/collaborator4/Chemical Tracking Data.csv -z core{{/code}}
356 +* //Destination: //user's Home directory in the Guacamole or JupyterHub VM {{code}}.{{/code}}
357 +* //Command group/option: //{{code}}file sync{{/code}}
358 +
359 +{{code language="none"}}
360 +collaborator4@jupyter-collaborator4:~$ pilotcli file sync indoctestproject/collaborator4/'Chemical Tracking Data.csv' . -z core
361 +start downloading...
362 +Downloading Chemical Tracking Data.csv |██████████████████████████████ 100% 00:00
363 +File has been downloaded successfully and saved to: ./Chemical Tracking Data.csv
364 +{{/code}}
365 +
366 +To confirm successful download, type {{code}}ls{{/code}} and verify the file "Chemical Tracking Data.csv" is stored in the Home folder.
367 +
368 +{{code language="none"}}
369 +collaborator4@jupyter-collaborator4:~$ ls
370 +'Chemical Tracking Data.csv' pilotcli
371 +{{/code}}
372 +
373 +The file “Chemical Tracking Data.csv” can be viewed in the JupyterHub graphical user interface:
374 +
375 +[[image:1723798383409-873.png||height="267" width="874"]]
376 +
377 +
378 +== Uploading Project Data from JupyterHub using the Pilot Command Line Interface ==
379 +
380 +After analyzing Project data inside the JupyterHub, you can upload the generated outputs back into the Project via the Pilot Command Line Interface.
381 +
382 +=== Example ===
383 +
384 +* //Filename//: Chemical Tracking Data rev.csv
385 +* //Source~:// user's Home directory in JupyterHub {{code}}.{{/code}}
386 +* //Destination//: Project “Indoc Test Project”, folder “collaborator4”, “Core” storage zone,
387 +{{code}}indoctestproject/collaborator4{{/code}} {{code}}-z core{{/code}}
388 +* //Command group/option~:// {{code}}file upload{{/code}}
389 +* //User message// (for upload back to the Core): “my workbench output, no additional sensitive data”
390 +* //Command~:// {{code}}pilotcli file upload ./'Chemical Tracking Data rev.csv' -p{{/code}} {{code}}indoctestproject/collaborator4 -z core -m "my workbench output, no additional sensitive data"{{/code}}
391 +
392 +When uploading data to the Core, you are reminded that you are bypassing the usual Green Room upload workflow. To confirm, type {{code}}y{{/code}} at the prompt, or {{code}}N{{/code}} to cancel.
393 +
394 +{{code language="none"}}
395 +collaborator4@jupyter-collaborator4:~$ pilotcli file upload ./'Chemical Tracking Data rev.csv' -p indoctestproject/collaborator4 -z core -m "my workbench output, no additional sensitive data"
396 +You are about to transfer data directly to the PILOT Core! In accordance with the PILOT Terms of Use, please confirm that you have made your best efforts to
397 +pseudonymize or anonymize the data and that you have the legal authority to transfer and make this data available for dissemination and use within the PILOT .If you
398 +need to process the data to remove sensitive identifiers, please cancel this transfer and upload the data to the Green Room to perform these actions.
399 +To cancel this transfer, enter [n/No]
400 +To confirm and proceed with the data transfer, enter [y/Yes]
401 + [y/N]: y
402 +Starting upload of: ./Chemical Tracking Data rev.csv
403 +Pre-upload complete.
404 +Uploading Chemical Tracking Data rev.csv: |██████████████████████████████ 100% 00:00
405 +Upload Time: 2.92s for 1 files
406 +All uploading jobs have finished.
407 +{{/code}}
408 +
409 +After completing the upload, you can confirm that the new file “Chemical Tracking Data rev.csv” exists in the correct directory using the {{code}}pilotcli file list{{/code}} command and/or in the Portal File Explorer.
410 +
411 +{{code language="none"}}
412 +collaborator4@jupyter-collaborator4:~$ pilotcli file list indoctestproject/collaborator4 -z core
413 +Chemical Tracking Data rev.csv Chemical Tracking Data.csv
414 +{{/code}}
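
If you upload outputs regularly, the upload command can be wrapped in a small shell function. The following is a minimal sketch, not an official HDC tool: {{code}}upload_to_core{{/code}} and the {{code}}DRY_RUN{{/code}} switch are hypothetical names, and the only pilotcli options used are the ones shown in the example above ({{code}}-p{{/code}}, {{code}}-z{{/code}}, {{code}}-m{{/code}}).

```shell
# Hypothetical helper: upload one file to a Core folder with pilotcli.
# Set DRY_RUN=1 to print the command instead of running it.
upload_to_core() {
  file="$1"      # local file to upload
  target="$2"    # project code and folder, e.g. indoctestproject/collaborator4
  message="$3"   # upload message passed with -m

  # Refuse to continue if the local file is missing.
  if [ ! -f "$file" ]; then
    echo "No such file: $file" >&2
    return 1
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "pilotcli file upload '$file' -p '$target' -z core -m '$message'"
  else
    pilotcli file upload "$file" -p "$target" -z core -m "$message"
  fi
}
```

Calling it with {{code}}DRY_RUN=1{{/code}} set lets you review the exact command first. When run for real, pilotcli still asks you to confirm the transfer to the Core, as shown above.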
415 +
416 +[[image:1723798397694-530.png||height="217" width="863"]]
417 +
418 +----
419 +
420 +Copyright © 2023-2024 [[Indoc Systems>>url:https://www.indocsystems.com]].
421 +
422 +HealthDataCloud is powered by Pilot technology, a product of [[Indoc Systems>>url:https://www.indocsystems.com]].
HDC Project Workspace tool navigation Jupyterhub v1.0.0 2023-05-25.png
Author
... ... @@ -1,1 +1,0 @@
1 -XWiki.sgevans
Size
... ... @@ -1,1 +1,0 @@
1 -40.6 KB
Content
Project Workspace Jupyter Create Python Notebook v2.1.6 2023-02-07.png
Author
... ... @@ -1,1 +1,0 @@
1 -XWiki.sgevans
Size
... ... @@ -1,1 +1,0 @@
1 -793.7 KB
Content
Project Workspace Jupyter Launch Terminal v2.1.6 2023-02-07.png
Author
... ... @@ -1,1 +1,0 @@
1 -XWiki.sgevans
Size
... ... @@ -1,1 +1,0 @@
1 -346.3 KB
Content
1723798240534-407.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +18.0 KB
Content
1723798257792-201.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +42.2 KB
Content
1723798278604-114.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +216.4 KB
Content
1723798293872-992.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +113.5 KB
Content
1723798325144-485.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +423.2 KB
Content
1723798338447-469.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +27.1 KB
Content
1723798355215-434.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +14.1 KB
Content
1723798365454-527.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +10.0 KB
Content
1723798383409-873.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +30.1 KB
Content
1723798397694-530.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.dsegebarth
Size
... ... @@ -1,0 +1,1 @@
1 +166.8 KB
Content