QMUL EECS GPU Service Instruction
Changjae Oh

Updates
• 29 Jun 2022
  • Combined the existing document for the 21/22 Final Project and the current GPU option (JP module)
  • Added useful tips from former JP students (Huaixi Tang (19/20), Hengyi Wang, Chaoran Zhu (20/21), Yehan Fan (21/22))
• 04 Jul 2022
  • Conda create revised
• 11 May 2023
  • Conda create revised

0. Spec
• NVIDIA A40 (48G) x 4
• Storage 50G
  • This storage is not for your dataset if the dataset is too large
  • We have separate storage that is ONLY for downloading datasets
  • To download a dataset, email me with the dataset details (why, size) and the download instructions

1. Set up your EECS account
• You first need to set up an EECS account (different from the QMUL/QMPlus account). The EECS account gives you access to our EECS JupyterHub, where you can create a GPU-enabled virtual server for your project.
• You should have received your username and a one-time account unlocking code in your QMUL email. Now set up your own password as follows:
  • Visit https://www.eecs.qmul.ac.uk/password (Firefox or Chrome recommended)
  • Read the instructions on screen carefully
  • Enter your username and unlocking code
  • Think of a new EECS password and enter it twice (please make sure that you will remember it, not just save it in the browser)

2. Log in to EECS JupyterHub
• To watch a demonstration video:
  • https://media.qmplus.qmul.ac.uk/media/BBC6521+EECS+JuypterHub+ML+Server+Usage/1_lnslk2n3
• Visit https://jhub.eecs.qmul.ac.uk/ (Firefox or Chrome recommended)
• Enter your EECS username and password. You will then be directed to the Hub Control Panel, where you can find a list of servers. Locate "Project (JP)" and click "Start".
• It takes a short moment for the server to start. Please be patient and wait.
• You will then be directed to a workspace with the file browser, Launcher and various menus.

3. Transferring files to/from servers
• To upload your files, simply drag and drop them into the file browser of the workspace. To download a file, right-click it and choose "Download". This should be similar to most online Integrated Development Environments (IDEs).
• BUT this can be slow if you are outside the UK.
• If you need any large data, download it directly onto the server
  • You can use wget (or other commands) (google "wget command linux")

4. Test your GPU
• You can now import the basic usage notebook "bbc6521_gpu.ipynb". Then go through the notebook and learn how to work with the server.
• Also, check out: BBC6521_GPU_Notebook.pdf
• Detailed documentation of JupyterHub can be found at: https://jupyter.org/documentation
• The example code may not work well because the Notebook version has changed! If so, skip this stage.

5. More practical way
• For deep learning, using the terminal can be easier than using Jupyter Notebook
  • Disconnection happens very often if you use Jupyter Notebook.
• Use the terminal for setting up your environment (e.g. anaconda, library installation (pip install)) and for training your model
• Go to Terminal and type nvidia-smi to check that you can see the GPU status

Tips: Some useful Linux commands
• Check the GPU usage
  • nvidia-smi
  • top # To check who is using the server
• Keep your job running on the GPUs after you log out
  • nohup python -u [path-to-your-python-code] > [path-to-your-output].out 2>&1 &
• Choose the GPUs you want to use
  • export CUDA_VISIBLE_DEVICES=0
  • or 1 or 2 or 3 (the GPU numbers shown when you check nvidia-smi). Note that there must be no space after the "=".
  • Do not use all GPUs at a time
• Note that sudo commands are not available on the server
  • You should ask Dr Matthew Tang (cc me) if you need to install something with sudo
  • And discuss with me first

Tips: Conda environment
• You may have heard about anaconda, a platform to deal with various data science packages.
• You can create a virtual environment that contains the libraries at the versions you want to install
• See this webpage: https://towardsdatascience.com/setting-up-a-new-pytorch-deep-learning-environment313d8d1c2df0
  • (You don't need to install anaconda on the QM GPU server as it is already installed, so please read the above webpage from "Create a virtual environment")
• On the QM GPU server, conda create -n test_env is not enough, as the installed environment can be reset when the Jupyter Server restarts! (This happened last year)
• So, please make a folder in your user folder and install the conda environment there. Commands like these should work:
  • Make a folder to store conda environments: mkdir test_cj_conda_env
  • Create a conda env.: conda create --prefix ./test_cj_conda_env/testenv python=3.6
  • This makes a folder (I named it test_cj_conda_env) and installs your conda environment (I named it testenv) in that folder. It won't be deleted even if the server is reset.

More tips welcome!
• Note that some instructions are from your seniors (JP final project in 20/21, 21/22) and me after we struggled many times
  • (Thanks Huaixi Tang, Hengyi Wang, Chaoran Zhu, Yehan Fan!)
• Please share your tips for using QM GPUs more easily and contribute to this manual!
  • E.g. I hope one of you comes back to me with "HOW_TO_USE_TENSORBOARD" ☺
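
Example: choosing one GPU from Python
The Linux tips in this manual say to check nvidia-smi, set CUDA_VISIBLE_DEVICES, and not use all GPUs at a time. The sketch below shows one possible way to do the same from inside a Python training script. It is only an illustration, not part of the server setup: the helper names (parse_gpu_memory, pick_least_used_gpu, select_gpu) are made up for this example, and it assumes nvidia-smi is on the PATH and reports memory usage as plain numbers.

```python
import os
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, memory.used' CSV rows into a list of (index, used_mib)."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used = (field.strip() for field in line.split(","))
        rows.append((int(index), int(used)))
    return rows


def pick_least_used_gpu(rows):
    """Return the index of the GPU with the least memory currently in use."""
    return min(rows, key=lambda row: row[1])[0]


def query_gpus():
    """Ask nvidia-smi for per-GPU memory usage (assumes nvidia-smi is on PATH)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def select_gpu():
    """Restrict this process to the single least-used GPU and return its index."""
    gpu = pick_least_used_gpu(parse_gpu_memory(query_gpus()))
    # Set this before the deep learning library initialises CUDA
    # (safest: before importing it).
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return gpu
```

To use it, call select_gpu() at the very top of your training script, before importing PyTorch or TensorFlow. The script can then be started with the nohup command from the Linux tips so that it keeps running after you log out.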