Leave Deep Learning Process Running on AWS and Reconnect Later

I have been doing some deep learning on simple datasets that run in under an hour. However, as I start working on bigger projects, I want to run them on AWS, and even there some of them need a very long time to run (several hours, if not days). Ideally, I would like to leave them running, switch off my laptop, and come back later to check on them.

My questions are essentially:

1. Once I have SSH'd into my instance and set the Python script running, is there anything I need to type to tell it that I will be leaving?

2. Can I just exit the terminal directly and turn my laptop off, or will this interrupt the process?

3. When I come back at a later date and SSH back into the instance, what do I need to type to reconnect to the process that has been running?

4. How can I check how far it has progressed and how far it has left to go? (My Python script outputs the number of epochs/batches.)
Apr 24, 2018 in AWS by Flying geek
• 3,280 points

1 answer to this question.

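You can run the process inside a screen session, which lets you disconnect and reconnect without interrupting the process. If you simply close the terminal without detaching first, a script started directly in the SSH shell will normally be killed when the connection drops.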

  1. SSH into instance
  2. Type 'screen'
  3. Run script
  4. Press Ctrl+a, then d (or Ctrl+d), to detach

You can now disconnect.

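When you SSH back in later, reattach to the session by typing 'screen -r'. The terminal shows whatever your script has printed in the meantime, so you can read the latest epoch/batch output there.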
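The same workflow can also be scripted. The following is a minimal sketch, not the only way to do it: the session name `training`, the script `train.py`, and the log file `train.log` are placeholder names that do not come from the question. It starts the script in an already-detached session and also writes the output to a file, so progress can be checked without reattaching.

```shell
# Sketch only: "training", train.py and train.log are placeholder names.
SESSION="training"
LOG="train.log"

# Start the script inside a detached, named screen session.
# -d -m starts screen detached; -S gives the session a name.
# python -u disables output buffering so the log updates promptly.
start_training() {
  screen -dmS "$SESSION" bash -c "python -u train.py 2>&1 | tee $LOG"
}

# Reattach to the named session; detach again with Ctrl+a d.
reattach() {
  screen -r "$SESSION"
}

# Show the last few lines of output without attaching.
check_progress() {
  tail -n 20 "$LOG"
}
```

After defining these functions, you would call `start_training` once, log out, and later run `check_progress` or `reattach` from a new SSH session.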

You can have multiple screen sessions running at once. List them with 'screen -ls', then reattach to a specific one by its PID with 'screen -r <PID>'.
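If you name your sessions, looking up the PID can be automated. This is a rough sketch that assumes each session line in the `screen -ls` output begins with `PID.name`, which is the usual format but may vary between versions. The helper names `session_pid` and `attach_by_name` are made up for this example.

```shell
# Sketch only: helper names and session names are placeholders.

# Read `screen -ls` output on stdin and print the PID of the session
# whose name matches $1. Each session line is assumed to start with "PID.name".
session_pid() {
  awk -v name="$1" '{
    dot = index($1, ".")
    if (dot > 0 && substr($1, dot + 1) == name) print substr($1, 1, dot - 1)
  }'
}

# Reattach to the session with the given name by looking up its PID.
attach_by_name() {
  pid="$(screen -ls | session_pid "$1")"
  [ -n "$pid" ] && screen -r "$pid"
}
```

For example, `attach_by_name training` would reattach to a session that was started with `screen -S training`.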

For more information on screen, see the Screen User's Manual.

answered Apr 24, 2018 by Cloud gunner
• 4,670 points

