Changes for page Using JupyterHub in HDC

Last modified by Dennis Segebarth on 2024/10/02 18:14

From version 1.1
edited by Susan Evans
on 2023/07/11 13:35
Change comment: (Autosaved)
To version 3.1
edited by Susan Evans
on 2023/07/11 14:11
Change comment: There is no comment for this version

Summary

Details

Page properties
Content
... ... @@ -1,3 +1,8 @@
1 +{{box cssClass="floatinginfobox" title="Table of Contents"}}
2 +{{toc depth="2"/}}
3 +{{/box}}
4 +
5 +
1 1  JupyterHub is an open-source, multi-user version of Jupyter Notebook for performing analysis of Project files in the Core. More information can be found in the application documentation [[https:~~/~~/jupyter.org/>>https://jupyter.org/]].
2 2  
3 3  = How it Works =
... ... @@ -19,7 +19,7 @@
19 19  
20 20  = Launching JupyterHub =
21 21  
22 -[[image:HDC Project Workspace tool navigation Jupyterhub v1.0.0 2023-05-25.png||height="10%" width="30%"]]
27 +[[image:HDC Project Workspace tool navigation Jupyterhub v1.0.0 2023-05-25.png||height="9%" width="25%"]]
23 23  
24 24  1. Launch your Project and click the **JupyterHub icon** in the left menu bar.
25 25  1. Click **Sign in with Keycloak** to initiate your session. JupyterHub automatically authenticates with your existing username and password and launches your session - no additional sign-in is required.
... ... @@ -180,8 +180,232 @@
180 180  
181 181  Afterwards, the environment will be listed when you open the Launcher to open a new Jupyter Notebook:
182 182  
188 +[[image:Project Workspace Jupyter view new Kernel 2023-07-11.png||height="25%" width="50%"]]
183 183  
184 184  
191 +and also from each opened Notebook, e.g., via **Kernel > Change Kernel…**:
185 185  
193 +[[image:Project Workspace Jupyter Kernel change Kernel dropdown 2023-07-11.png||height="32%" width="25%"]]
186 186  
195 += Installing New Python Packages =
196 +
197 +We highly recommend the use of virtual environments when installing new packages (see //Creating a Python Virtual Environment and Registering a Kernel// above for more details). Consequently, we recommend installing new packages via commands in the JupyterHub terminal in the corresponding virtual environments, instead of installing packages from within Jupyter Notebooks.
198 +
199 +Depending on your organization's IT policies, outbound traffic may need to go through a proxy. If so, users must supply the proxy as a command line argument to tools such as pip, curl, and wget.
200 +
201 +For example:
202 +
203 +{{code language="none"}}
204 +pip install my_package
205 +{{/code}}
206 +
207 +If you are using conda to manage Python packages:
208 +
209 +{{code language="none"}}
210 +conda install my_package
211 +{{/code}}
212 +
213 +The above information is provided as examples only. Please refer to documentation provided by your IT department with respect to proxy configuration.
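If your environment does require a proxy, the following is a minimal sketch of how it could be supplied in a JupyterHub terminal. The address {{code}}http://proxy.example.org:8080{{/code}} is a placeholder assumed for illustration only; use the value given by your IT department. pip, curl, wget, and conda generally honor the proxy environment variables set below, and pip and curl also accept an explicit {{code}}--proxy{{/code}} option.

```shell
# Placeholder proxy address (an assumption for this example, not a real server).
PROXY_URL="http://proxy.example.org:8080"

# Export the proxy for the current terminal session.
# Both lowercase and uppercase names are set because tools differ in which they read.
export http_proxy="$PROXY_URL"
export https_proxy="$PROXY_URL"
export HTTP_PROXY="$PROXY_URL"
export HTTPS_PROXY="$PROXY_URL"

# Alternatively, pass the proxy explicitly to a single command:
#   pip install --proxy "$PROXY_URL" my_package
#   curl --proxy "$PROXY_URL" -O https://example.org/file.txt
```

Variables exported this way apply only to the terminal session in which they are set.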
214 +
215 += Using the Pilot Command Line Interface in a JupyterHub Terminal =
216 +
217 +The Pilot Command Line Interface (CLI) is deployed within JupyterHub as an extension resource. Project members can use the Pilot Command Line Interface in a JupyterHub terminal to download Project data from the Core for further analysis, and upload the derivative outputs back to the Green Room or Core.
218 +
219 +The Home Directory is your default directory. When you download a copy of your Core files to JupyterHub, the files persist in the JupyterHub environment until you delete them, so you can return to the session and continue your work at a later time without retrieving the data from the Core again.
220 +
221 +The following sections focus on getting started with basic pilotcli commands in JupyterHub. For additional pilotcli commands and usage, see the article //Working with HDC Project Files in the Command Line Interface//.
222 +
223 +== Launching Pilot Command Line Interface ==
224 +
225 +1. Launch your Project and click the **JupyterHub** icon in the workspace icon group.
226 +1. Click the **Terminal** launcher icon to open the Terminal.
227 +1. In the JupyterHub Terminal, type {{code}}pilotcli{{/code}} to launch the latest version of the Pilot Command Line Interface.
228 +1. Use {{code}}pilotcli --help{{/code}} at any time to show the welcome message again.
229 +
230 +{{code language="none"}}
231 +collaborator4@jupyter-collaborator4:~$ pilotcli
232 +Usage: pilotcli [OPTIONS] COMMAND [ARGS]...
233 +
234 + What's new (Version 2.2.0):
235 +
236 + 1. CLI supports to perform multi-threading upload for file/folders
237 +
238 + 2. CLI supports to perform resumable upload for single file
239 +
240 +
241 +
242 +Options:
243 + --help Show this message and exit.
244 +
245 +Commands:
246 + container_registry Container Registry Actions.
247 + dataset Dataset Actions.
248 + file File Actions.
249 + project Project Actions.
250 + use_config Config Actions.
251 + user User Actions.
252 +{{/code}}
253 +
254 +== Logging into the Pilot Command Line Interface ==
255 +
256 +Users are required to log in with platform credentials before performing any tasks through the Pilot Command Line Interface.
257 +
258 +* Use the command {{code}}pilotcli user login{{/code}} to log into the Pilot Command Line Interface.
259 +
260 +{{code language="none"}}
261 +collaborator4@jupyter-collaborator4:~$ pilotcli user login
262 +Please, access https://iam.staging.pilot.indocresearch.com/realms/pilot/device?user_code=XXXX-XXXX to proceed
263 + ▄▄▄▄▄▄▄ ▄ ▄▄ ▄ ▄▄▄▄ ▄ ▄▄▄▄▄▄▄
264 + █ ▄▄▄ █ ▄ ▄███ ▀▀ █▀ ▀██▄ █ ▄▄▄ █
265 + █ ▄ ▀ ▄ ▀▄ ▀▀ ▄█▀▄▀ ▀▀▄█▄▄▀ █████▄▄▀▄
266 + ▄▄▄▄▄▄▄ ▀ ▀█▄ ▀▄ ██▀█ ▄▀▄▄ █ ▄ █▀▄▄▄
267 + █ ▄▄▄ █ █▀█▄▀ █▀ █▀▀█ ▀▄█▄█▄▄▄█▀▄█
268 + █ ███ █ █▀██▀▄ █▀▄▄▀▀█▄▀▀█▄▀█ ▀ ▀▄▀██
269 + █▄▄▄▄▄█ ▄▀▄▄██▄▄▀▄ ▀▀▄ ▄▄▀▀▀▄ █▄▄▄█
187 187  
271 + Waiting validation finish...
272 +{{/code}}
273 +
274 +* (((
275 +You’ll be asked to validate your HDC user account using one of the provided methods.
276 +
277 +* Copy and paste the provided validation link into a new browser tab or
278 +* Scan the QR code with your mobile device.
279 +)))
280 +* Open the login window and enter your HDC username and password (i.e. your EBRAINS account credentials).
281 +* Grant access by clicking **Yes**.
282 +
283 +[[image:Pilotcli Jupyter user login Grant Access window v2.4.0 2023-05-25.png||height="28%" width="30%"]]
284 +
285 +[[image:Pilotcli Jupyter user login Device Login Successful v2.4.0 2023-05-25.png||height="10%" width="30%"]]
286 +
287 +* After successful confirmation, return to the terminal in your JupyterHub browser tab.
288 +
289 +{{code language="none"}}
290 +Welcome to the Command Line Tool!
291 +{{/code}}
292 +
293 +* You’re now ready to start using the Pilot Command Line Interface to work with your Project data in JupyterHub.
294 +
295 +== Zone Restrictions when using Pilot Command Line Interface in JupyterHub ==
296 +
297 +When using the Pilot Command Line Interface in JupyterHub, the following file operations are permitted in each zone:
298 +
299 +|=(% colspan="1" rowspan="1" %)(((
300 +**File Operation**
301 +)))|=(% colspan="1" rowspan="1" %)(((
302 +**Permitted in the **
303 +**Green Room**
304 +)))|=(% colspan="1" rowspan="1" %)(((
305 +**Permitted in the **
306 +**Core**
307 +)))
308 +|(% colspan="1" rowspan="1" %)File upload 
309 +(upload derivative output files from JupyterHub to the Green Room or Core storage)|(% colspan="1" rowspan="1" %)(((
310 +Yes
311 +)))|(% colspan="1" rowspan="1" %)(((
312 +Yes
313 +)))
314 +|(% colspan="1" rowspan="1" %)File download
315 +(download files from Green Room or Core into JupyterHub)|(% colspan="1" rowspan="1" %)(((
316 +**No**
317 +)))|(% colspan="1" rowspan="1" %)(((
318 +Yes
319 +)))
320 +
321 +== Downloading Project Data to JupyterHub using the Pilot Command Line Interface ==
322 +
323 +After logging into the Pilot Command Line Interface, you can download data from the Project Core into the JupyterHub environment to start your data analyses.
324 +
325 +File-related commands are grouped in the {{code}}file{{/code}} category. To view the full list of commands in this category, type {{code}}pilotcli file --help{{/code}}. To download Project data, use the {{code}}file sync{{/code}} command. To view its usage and options, type {{code}}pilotcli file sync --help{{/code}}.
326 +
327 +
328 +{{code language="none"}}
329 +collaborator4@jupyter-collaborator4:~$ pilotcli file sync --help
330 +Usage: pilotcli file sync [OPTIONS] [PATHS]... OUTPUT_PATH
331 +
332 + Download files/folders from a given Project/folder/file in core zone.
333 +
334 +Options:
335 + -z, --zone TEXT Target Zone (i.e., core/greenroom)
336 + --zip Download files as a zip.
337 + -i, --geid Enable downloading by geid.
338 + --help Show this message and exit.
339 +{{/code}}
340 +
341 +=== Example ===
342 +
343 +Downloading a file from the Core to your Home Directory:
344 +
345 +Reminder: Please follow Linux conventions for file management. If your filename contains spaces, wrap it in single or double quotes.
346 +
347 +* //Filename~:// “Chemical Tracking Data.csv”
348 +* //Source~:// Project “Indoc Test Project”, “Core” storage zone, folder “collaborator4” {{code}}indoctestproject/collaborator4/Chemical Tracking Data.csv -z core{{/code}}
349 +* //Destination: //user's Home directory in the Guacamole or JupyterHub VM {{code}}.{{/code}}
350 +* //Command group/option: //{{code}}file sync{{/code}}
351 +
352 +{{code language="none"}}
353 +collaborator4@jupyter-collaborator4:~$ pilotcli file sync indoctestproject/collaborator4/'Chemical Tracking Data.csv' . -z core
354 +start downloading...
355 +Downloading Chemical Tracking Data.csv |██████████████████████████████ 100% 00:00
356 +File has been downloaded successfully and saved to: ./Chemical Tracking Data.csv
357 +{{/code}}
358 +
359 +To confirm successful download, type {{code}}ls{{/code}} and verify the file "Chemical Tracking Data.csv" is stored in the Home folder.
360 +
361 +{{code language="none"}}
362 +collaborator4@jupyter-collaborator4:~$ ls
363 +'Chemical Tracking Data.csv' pilotcli
364 +{{/code}}
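The same check can be scripted. The sketch below is illustrative only: {{code}}check_download{{/code}} is a hypothetical helper name, not part of pilotcli. It tests that a path exists and is not empty, and the argument is quoted so that filenames containing spaces are handled as a single path.

```shell
# Hypothetical helper: report whether a downloaded file exists and is non-empty.
check_download() {
    # "$1" is quoted so that a filename with spaces stays one argument.
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "MISSING: $1"
    fi
}

# Usage (run from the Home directory after the download above):
#   check_download "./Chemical Tracking Data.csv"
```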
365 +
366 +The file “Chemical Tracking Data.csv” can be viewed in the JupyterHub graphical user interface:
367 +
368 +[[image:Jupyter downloaded file in Home folder v2.4.11 2023-05-25 1850.png||height="15%" width="50%"]]
369 +
370 +
371 +== Uploading Project Data from JupyterHub using the Pilot Command Line Interface ==
372 +
373 +After analyzing Project data inside JupyterHub, you can upload the generated outputs back into the Project via the Pilot Command Line Interface.
374 +
375 +=== Example ===
376 +
377 +* //Filename//: Chemical Tracking Data rev.csv
378 +* //Source~:// user's Home directory in JupyterHub {{code}}.{{/code}}
379 +* //Destination//: Project “Indoc Test Project”, folder “collaborator4”, “Core” storage zone,
380 +{{code}}indoctestproject/collaborator4{{/code}} {{code}}-z core{{/code}}
381 +* //Command group/option~:// {{code}}file upload{{/code}}
382 +* //User message// (for upload back to the Core): “my workbench output, no additional sensitive data”
383 +* //Command~:// {{code}}pilotcli file upload ./'Chemical Tracking Data rev.csv' -p indoctestproject/collaborator4 -z core -m "my workbench output, no additional sensitive data"{{/code}}
384 +
385 +When uploading data to the Core, you are reminded that you are bypassing the usual Green Room upload workflow. To confirm, type {{code}}y{{/code}} at the prompt, or {{code}}N{{/code}} to cancel.
386 +
387 +{{code language="none"}}
388 +collaborator4@jupyter-collaborator4:~$ pilotcli file upload ./'Chemical Tracking Data rev.csv' -p indoctestproject/collaborator4 -z core -m "my workbench output, no additional sensitive data"
389 +You are about to transfer data directly to the PILOT Core! In accordance with the PILOT Terms of Use, please confirm that you have made your best efforts to
390 +pseudonymize or anonymize the data and that you have the legal authority to transfer and make this data available for dissemination and use within the PILOT .If you
391 +need to process the data to remove sensitive identifiers, please cancel this transfer and upload the data to the Green Room to perform these actions.
392 +To cancel this transfer, enter [n/No]
393 +To confirm and proceed with the data transfer, enter [y/Yes]
394 + [y/N]: y
395 +Starting upload of: ./Chemical Tracking Data rev.csv
396 +Pre-upload complete.
397 +Uploading Chemical Tracking Data rev.csv: |██████████████████████████████ 100% 00:00
398 +Upload Time: 2.92s for 1 files
399 +All uploading jobs have finished.
400 +{{/code}}
401 +
402 +After completing the upload, you can confirm that the new file “Chemical Tracking Data rev.csv” exists in the correct directory using the {{code}}pilotcli file list{{/code}} command and/or in the Portal File Explorer.
403 +
404 +{{code language="none"}}
405 +collaborator4@jupyter-collaborator4:~$ pilotcli file list indoctestproject/collaborator4 -z core
406 +Chemical Tracking Data rev.csv Chemical Tracking Data.csv
407 +{{/code}}
408 +
409 +[[image:Jupyterhub file upload back to core v2.4.11 2023-05-25 1926.png||height="13%" width="50%"]]
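When several output files need to be uploaded, the single-file command above can be wrapped in a loop. The following is a sketch under stated assumptions, not a documented pilotcli feature: {{code}}upload_outputs{{/code}} and the {{code}}PILOTCLI{{/code}} override variable are hypothetical names introduced here, the {{code}}-p{{/code}}, {{code}}-z{{/code}} and {{code}}-m{{/code}} options are used exactly as in the example above, and the zone value {{code}}greenroom{{/code}} is assumed from the {{code}}file sync{{/code}} help text. Any confirmation prompt that pilotcli shows still has to be answered for each file.

```shell
# Hypothetical helper: upload every .csv file in a local folder, one command per file.
#   $1 local folder, $2 destination Project folder, $3 zone, $4 upload message
upload_outputs() {
    for f in "$1"/*.csv; do
        # Skip the unexpanded pattern when the folder contains no .csv files.
        [ -e "$f" ] || continue
        # PILOTCLI is an assumed override; set PILOTCLI=echo to preview the commands.
        ${PILOTCLI:-pilotcli} file upload "$f" -p "$2" -z "$3" -m "$4"
    done
}

# Usage (preview only, nothing is uploaded):
#   PILOTCLI=echo upload_outputs . indoctestproject/collaborator4 greenroom "my workbench output"
```

Previewing with {{code}}PILOTCLI=echo{{/code}} prints each command's arguments instead of running pilotcli, which makes it easy to check paths before transferring data.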
410 +
411 +----
412 +
413 +Copyright © 2023 [[Indoc Research>>url:https://www.indocresearch.org/]].
414 +
415 +HealthDataCloud is powered by Pilot technology, a product of [[Indoc Research>>url:https://www.indocresearch.org/]].
416 +
Jupyter downloaded file in Home folder v2.4.11 2023-05-25 1850.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.sgevans
Size
... ... @@ -1,0 +1,1 @@
1 +35.0 KB
Content
Jupyterhub file upload back to core v2.4.11 2023-05-25 1926.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.sgevans
Size
... ... @@ -1,0 +1,1 @@
1 +507.5 KB
Content
Pilotcli Jupyter user login Device Login Successful v2.4.0 2023-05-25.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.sgevans
Size
... ... @@ -1,0 +1,1 @@
1 +10.3 KB
Content
Pilotcli Jupyter user login Grant Access window v2.4.0 2023-05-25.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.sgevans
Size
... ... @@ -1,0 +1,1 @@
1 +13.9 KB
Content
Project Workspace Jupyter Kernel change Kernel dropdown 2023-07-11.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.sgevans
Size
... ... @@ -1,0 +1,1 @@
1 +38.9 KB
Content
Project Workspace Jupyter view new Kernel 2023-07-11.png
Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.sgevans
Size
... ... @@ -1,0 +1,1 @@
1 +490.0 KB
Content