When I create and remove files rapidly on Windows using Python, I get WindowsError [Error 5]

I came across the above-stated problem when using Scrapy's FifoDiskQueue. On Windows, FifoDiskQueue causes directories and files to be created by one file descriptor and consumed by another.

Sometimes I'll randomly get error messages like this one:

2015-08-25 18:51:30 [scrapy] INFO: Error while handling downloader output
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 588, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "C:\Python27\lib\site-packages\scrapy\core\engine.py", line 154, in _handle_downloader_output
    self.crawl(response, spider)
  File "C:\Python27\lib\site-packages\scrapy\core\engine.py", line 182, in crawl
    self.schedule(request, spider)
  File "C:\Python27\lib\site-packages\scrapy\core\engine.py", line 188, in schedule
    if not self.slot.scheduler.enqueue_request(request):
  File "C:\Python27\lib\site-packages\scrapy\core\scheduler.py", line 54, in enqueue_request
    dqok = self._dqpush(request)
  File "C:\Python27\lib\site-packages\scrapy\core\scheduler.py", line 83, in _dqpush
    self.dqs.push(reqd, -request.priority)
  File "C:\Python27\lib\site-packages\queuelib\pqueue.py", line 33, in push
    self.queues[priority] = self.qfactory(priority)
  File "C:\Python27\lib\site-packages\scrapy\core\scheduler.py", line 106, in _newdq
    return self.dqclass(join(self.dqdir, 'p%s' % priority))
  File "C:\Python27\lib\site-packages\queuelib\queue.py", line 43, in __init__
  File "C:\Python27\lib\os.py", line 157, in makedirs
    mkdir(name, mode)
WindowsError: [Error 5] : './sogou_job\\requests.queue\\p-50'

After a little research I learned that Error 5 means access is denied. Many explanations on the web, like this MSDN post, attribute it to a lack of administrative rights, but here the cause is not related to access rights: when I run the scrapy crawl command as an Administrator in the command prompt, the problem still occurs.
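To tell this particular failure apart from other OS errors in an exception handler, one option is to inspect the error code on the exception. The following is a minimal sketch written for Python 3 (the code in this question is Python 2); the helper name is_access_denied is hypothetical and not part of Scrapy or queuelib. It relies on Windows error 5 (ERROR_ACCESS_DENIED) being exposed as the winerror attribute on Windows and being mapped to errno.EACCES.

```python
import errno


def is_access_denied(exc):
    """Return True if exc looks like an OS-level 'access denied' error.

    Hypothetical helper for illustration only.
    """
    # On Windows, OSError carries the native error code in .winerror;
    # 5 is ERROR_ACCESS_DENIED. The attribute is absent on other platforms.
    if getattr(exc, "winerror", None) == 5:
        return True
    # Portable fallback: compare the errno value instead.
    return getattr(exc, "errno", None) == errno.EACCES
```

A handler could then call is_access_denied(e) inside an except OSError block and re-raise anything it does not recognise.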

I have also created a small test to try on Windows and Linux:

import os
import shutil
import time

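for i in range(1000):
    somedir = "testingdir"
    try:
        # Create the directory, write one file into it, then delete it all.
        os.makedirs(somedir)
        with open(os.path.join(somedir, "testing.txt"), 'w') as out:
            out.write("Oh no")
        shutil.rmtree(somedir)
    except WindowsError as e:
        print 'round', i, e
        raise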

And the output of the above test code is as follows:

round 13 [Error 5] : 'testingdir'
Traceback (most recent call last):
  File "E:\FHT360\FHT360_Mobile\Source\keywordranks\test.py", line 10, in <module>
  File "C:\Users\yj\Anaconda\lib\os.py", line 157, in makedirs
    mkdir(name, mode)
WindowsError: [Error 5] : 'testingdir'

The failing round is different every time. If I remove the raise at the end, I get something like this:

round 5 [Error 5] : 'testingdir'
round 67 [Error 5] : 'testingdir'
round 589 [Error 5] : 'testingdir'
round 875 [Error 5] : 'testingdir'

It simply fails randomly, with a small probability, ONLY on Windows. I tried this test script in Cygwin and on Linux, and the error never happens there. I also tried the same code on another Windows machine, and it occurs there too.

Aug 30, 2018 in Python by aryya

1 answer to this question.

Here's the short answer:

Disable any antivirus or document-indexing software, or at least configure it not to scan your working directory.

Long answer: you can spend months trying to fix this kind of problem. So far, the only workaround that does not involve disabling the antivirus is to assume that you will not be able to remove all files or directories.

Build this assumption into your code: use a different root subdirectory each time the service starts, try to clean up the older ones, and ignore any removal failures.
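The workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions, written for Python 3: the function names fresh_workdir and cleanup_old_workdirs and the "run-" prefix are hypothetical, and only standard-library calls (tempfile.mkdtemp, shutil.rmtree with ignore_errors=True) are used.

```python
import os
import shutil
import tempfile


def fresh_workdir(root, prefix="run-"):
    """Create and return a new, uniquely named subdirectory of root."""
    if not os.path.isdir(root):
        os.makedirs(root)
    return tempfile.mkdtemp(prefix=prefix, dir=root)


def cleanup_old_workdirs(root, keep, prefix="run-"):
    """Best-effort removal of earlier run directories under root.

    Returns the paths that could not be removed (for example because
    another process still holds a handle on them), so a later start
    can try again.
    """
    leftovers = []
    keep = os.path.abspath(keep)
    for name in os.listdir(root):
        path = os.path.abspath(os.path.join(root, name))
        # Skip anything that is not one of our run directories,
        # and never touch the directory currently in use.
        if not name.startswith(prefix) or not os.path.isdir(path) or path == keep:
            continue
        shutil.rmtree(path, ignore_errors=True)  # never raise on failure
        if os.path.exists(path):
            leftovers.append(path)
    return leftovers
```

At startup, the service would call fresh_workdir once, use the returned path for all its queue files, and then call cleanup_old_workdirs with that path as keep. Directories that cannot be deleted are simply left behind rather than crashing the run.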

answered Aug 31, 2018 by charlie_brown

