When I try to use the whole dataset for training, the kernel dies when it reaches batch 15. But if I only iterate through the whole dataset, there's no problem.
I only modified one line from the tutorial notebook.
Has anyone encountered the same issue?
Are you sure you are managing your memory correctly?
If you iterate batch by batch through the whole set using the generator, each batch replaces the previous one, so your memory usage only fluctuates by the difference in size between two batches. However, from your error above, it looks like you are accumulating batches (rather than replacing them), which will cause your RAM to overflow around the 15th batch. Any time you overflow your memory, it gets flushed and your kernel restarts.
To train the model with 4 GB of RAM, you need to do it iteratively.
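
To make the difference concrete, here is a minimal Python sketch of the two patterns. The `batch_generator` and `train_step` functions are hypothetical stand-ins for whatever the tutorial notebook actually uses; the point is only that the loop consumes one batch at a time instead of holding all of them in memory.

```python
import numpy as np

def batch_generator(n_batches, batch_size, n_features):
    """Hypothetical generator: yields one batch at a time, so only a
    single batch is alive in memory at any given moment."""
    for _ in range(n_batches):
        x = np.random.rand(batch_size, n_features).astype(np.float32)
        y = np.random.randint(0, 2, size=batch_size)
        yield x, y

def train_step(x, y):
    """Placeholder for whatever the model does with one batch
    (e.g. a single gradient update)."""
    pass

# Memory-friendly: iterate batch by batch; each new batch replaces the
# previous one, so RAM usage stays roughly constant.
for x, y in batch_generator(n_batches=30, batch_size=64, n_features=1000):
    train_step(x, y)

# Memory-hungry: collecting every batch in a list keeps all of them alive
# at once and will eventually exhaust a 4 GB machine.
# all_batches = [batch for batch in batch_generator(30, 64, 1000)]
```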
Hi Herve,
I do replace the previous batch. I only modified that one line of code, so I think I am doing it iteratively. Can you try it and see whether you run into the same problem? I noticed that the batch sizes differ from batch to batch, so I'm afraid that 4 GB of RAM is not enough for the 15th batch plus the model at that point. I will look into the details later. Thanks!