当前位置:   article > 正文

Python 日志logging模块初探及多线程踩坑(2)_timedrotatingfilehandler 多线程

timedrotatingfilehandler 多线程

系列文章:

Python 日志logging模块初探及多线程踩坑(1)

Python 日志logging模块初探及多线程踩坑(2)


接着上面一篇文章,我们这篇来写一个多进程兼容且无损性能的 TimedRotatingFileHandler

TimedRotatingFileHandler 是 Python 官方提供的一个可以基于时间自动切分日志的 Handler 类。但是它是没有办法支持多进程的日志切换,多进程进行日志切换的时候会丢失日志数据,导致日志错乱。在win10上还会报错:‘win32Error :  文件被占用的错误’.

我们先好好看看源码,究竟是什么问题导致它的线程不安全:

  1. class TimedRotatingFileHandler(BaseRotatingHandler):
  2. """
  3. Handler for logging to a file, rotating the log file at certain timed
  4. intervals.
  5. If backupCount is > 0, when rollover is done, no more than backupCount
  6. files are kept - the oldest ones are deleted.
  7. """
  8. def __init__(self, filename, when='h', interval=1, backupCount=0, encoding=None, delay=False, utc=False, atTime=None):
  9. BaseRotatingHandler.__init__(self, filename, 'a', encoding, delay)
  10. self.when = when.upper()
  11. self.backupCount = backupCount
  12. self.utc = utc
  13. self.atTime = atTime
  14. # Calculate the real rollover interval, which is just the number of
  15. # seconds between rollovers. Also set the filename suffix used when
  16. # a rollover occurs. Current 'when' events supported:
  17. # S - Seconds
  18. # M - Minutes
  19. # H - Hours
  20. # D - Days
  21. # midnight - roll over at midnight
  22. # W{0-6} - roll over on a certain day; 0 - Monday
  23. #
  24. # Case of the 'when' specifier is not important; lower or upper case
  25. # will work.
  26. if self.when == 'S':
  27. self.interval = 1 # one second
  28. self.suffix = "%Y-%m-%d_%H-%M-%S"
  29. self.extMatch = r"^\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}(\.\w+)?$"
  30. elif self.when == 'M':
  31. self.interval = 60 # one minute
  32. self.suffix = "%Y-%m-%d_%H-%M"
  33. self.extMatch = r"^\d{4}-\d{2}-\d{2}_\d{2}-\d{2}(\.\w+)?$"
  34. elif self.when == 'H':
  35. self.interval = 60 * 60 # one hour
  36. self.suffix = "%Y-%m-%d_%H"
  37. self.extMatch = r"^\d{4}-\d{2}-\d{2}_\d{2}(\.\w+)?$"
  38. elif self.when == 'D' or self.when == 'MIDNIGHT':
  39. self.interval = 60 * 60 * 24 # one day
  40. self.suffix = "%Y-%m-%d"
  41. self.extMatch = r"^\d{4}-\d{2}-\d{2}(\.\w+)?$"
  42. elif self.when.startswith('W'):
  43. self.interval = 60 * 60 * 24 * 7 # one week
  44. if len(self.when) != 2:
  45. raise ValueError("You must specify a day for weekly rollover from 0 to 6 (0 is Monday): %s" % self.when)
  46. if self.when[1] < '0' or self.when[1] > '6':
  47. raise ValueError("Invalid day specified for weekly rollover: %s" % self.when)
  48. self.dayOfWeek = int(self.when[1])
  49. self.suffix = "%Y-%m-%d"
  50. self.extMatch = r"^\d{4}-\d{2}-\d{2}(\.\w+)?$"
  51. else:
  52. raise ValueError("Invalid rollover interval specified: %s" % self.when)
  53. self.extMatch = re.compile(self.extMatch, re.ASCII)
  54. self.interval = self.interval * interval # multiply by units requested
  55. # The following line added because the filename passed in could be a
  56. # path object (see Issue #27493), but self.baseFilename will be a string
  57. filename = self.baseFilename
  58. if os.path.exists(filename):
  59. t = os.stat(filename)[ST_MTIME]
  60. else:
  61. t = int(time.time())
  62. self.rolloverAt = self.computeRollover(t)
  63. def computeRollover(self, currentTime):
  64. """
  65. Work out the rollover time based on the specified time.
  66. """
  67. result = currentTime + self.interval
  68. # If we are rolling over at midnight or weekly, then the interval is already known.
  69. # What we need to figure out is WHEN the next interval is. In other words,
  70. # if you are rolling over at midnight, then your base interval is 1 day,
  71. # but you want to start that one day clock at midnight, not now. So, we
  72. # have to fudge the rolloverAt value in order to trigger the first rollover
  73. # at the right time. After that, the regular interval will take care of
  74. # the rest. Note that this code doesn't care about leap seconds. :)
  75. if self.when == 'MIDNIGHT' or self.when.startswith('W'):
  76. # This could be done with less code, but I wanted it to be clear
  77. if self.utc:
  78. t = time.gmtime(currentTime)
  79. else:
  80. t = time.localtime(currentTime)
  81. currentHour = t[3]
  82. currentMinute = t[4]
  83. currentSecond = t[5]
  84. currentDay = t[6]
  85. # r is the number of seconds left between now and the next rotation
  86. if self.atTime is None:
  87. rotate_ts = _MIDNIGHT
  88. else:
  89. rotate_ts = ((self.atTime.hour * 60 + self.atTime.minute)*60 +
  90. self.atTime.second)
  91. r = rotate_ts - ((currentHour * 60 + currentMinute) * 60 +
  92. currentSecond)
  93. if r < 0:
  94. # Rotate time is before the current time (for example when
  95. # self.rotateAt is 13:45 and it now 14:15), rotation is
  96. # tomorrow.
  97. r += _MIDNIGHT
  98. currentDay = (currentDay + 1) % 7
  99. result = currentTime + r
  100. # If we are rolling over on a certain day, add in the number of days until
  101. # the next rollover, but offset by 1 since we just calculated the time
  102. # until the next day starts. There are three cases:
  103. # Case 1) The day to rollover is today; in this case, do nothing
  104. # Case 2) The day to rollover is further in the interval (i.e., today is
  105. # day 2 (Wednesday) and rollover is on day 6 (Sunday). Days to
  106. # next rollover is simply 6 - 2 - 1, or 3.
  107. # Case 3) The day to rollover is behind us in the interval (i.e., today
  108. # is day 5 (Saturday) and rollover is on day 3 (Thursday).
  109. # Days to rollover is 6 - 5 + 3, or 4. In this case, it's the
  110. # number of days left in the current week (1) plus the number
  111. # of days in the next week until the rollover day (3).
  112. # The calculations described in 2) and 3) above need to have a day added.
  113. # This is because the above time calculation takes us to midnight on this
  114. # day, i.e. the start of the next day.
  115. if self.when.startswith('W'):
  116. day = currentDay # 0 is Monday
  117. if day != self.dayOfWeek:
  118. if day < self.dayOfWeek:
  119. daysToWait = self.dayOfWeek - day
  120. else:
  121. daysToWait = 6 - day + self.dayOfWeek + 1
  122. newRolloverAt = result + (daysToWait * (60 * 60 * 24))
  123. if not self.utc:
  124. dstNow = t[-1]
  125. dstAtRollover = time.localtime(newRolloverAt)[-1]
  126. if dstNow != dstAtRollover:
  127. if not dstNow: # DST kicks in before next rollover, so we need to deduct an hour
  128. addend = -3600
  129. else: # DST bows out before next rollover, so we need to add an hour
  130. addend = 3600
  131. newRolloverAt += addend
  132. result = newRolloverAt
  133. return result
  134. def shouldRollover(self, record):
  135. """
  136. Determine if rollover should occur.
  137. record is not used, as we are just comparing times, but it is needed so
  138. the method signatures are the same
  139. """
  140. t = int(time.time())
  141. if t >= self.rolloverAt:
  142. return 1
  143. return 0
  144. def getFilesToDelete(self):
  145. """
  146. Determine the files to delete when rolling over.
  147. More specific than the earlier method, which just used glob.glob().
  148. """
  149. dirName, baseName = os.path.split(self.baseFilename)
  150. fileNames = os.listdir(dirName)
  151. result = []
  152. prefix = baseName + "."
  153. plen = len(prefix)
  154. for fileName in fileNames:
  155. if fileName[:plen] == prefix:
  156. suffix = fileName[plen:]
  157. if self.extMatch.match(suffix):
  158. result.append(os.path.join(dirName, fileName))
  159. result.sort()
  160. if len(result) < self.backupCount:
  161. result = []
  162. else:
  163. result = result[:len(result) - self.backupCount]
  164. return result
  165. def doRollover(self):
  166. """
  167. do a rollover; in this case, a date/time stamp is appended to the filename
  168. when the rollover happens. However, you want the file to be named for the
  169. start of the interval, not the current time. If there is a backup count,
  170. then we have to get a list of matching filenames, sort them and remove
  171. the one with the oldest suffix.
  172. """
  173. if self.stream:
  174. self.stream.close()
  175. self.stream = None
  176. # get the time that this sequence started at and make it a TimeTuple
  177. currentTime = int(time.time())
  178. dstNow = time.localtime(currentTime)[-1]
  179. t = self.rolloverAt - self.interval
  180. if self.utc:
  181. timeTuple = time.gmtime(t)
  182. else:
  183. timeTuple = time.localtime(t)
  184. dstThen = timeTuple[-1]
  185. if dstNow != dstThen:
  186. if dstNow:
  187. addend = 3600
  188. else:
  189. addend = -3600
  190. timeTuple = time.localtime(t + addend)
  191. dfn = self.rotation_filename(self.baseFilename + "." +
  192. time.strftime(self.suffix, timeTuple))
  193. if os.path.exists(dfn):
  194. os.remove(dfn)
  195. self.rotate(self.baseFilename, dfn)
  196. if self.backupCount > 0:
  197. for s in self.getFilesToDelete():
  198. os.remove(s)
  199. if not self.delay:
  200. self.stream = self._open()
  201. newRolloverAt = self.computeRollover(currentTime)
  202. while newRolloverAt <= currentTime:
  203. newRolloverAt = newRolloverAt + self.interval
  204. #If DST changes and midnight or weekly rollover, adjust for this.
  205. if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:
  206. dstAtRollover = time.localtime(newRolloverAt)[-1]
  207. if dstNow != dstAtRollover:
  208. if not dstNow: # DST kicks in before next rollover, so we need to deduct an hour
  209. addend = -3600
  210. else: # DST bows out before next rollover, so we need to add an hour
  211. addend = 3600
  212. newRolloverAt += addend
  213. self.rolloverAt = newRolloverAt

   首先日志会被打印在一个叫 baseFilename 名字的文件中。然后在 Rotate 的时候会根据你想要打印的参数生成对应新文件的名字也就是上面函数的 dfn 的值。然后会将现在的文件重命名为 dfn 的值。之后在重新创建一个 baseFilename 的文件。然后继续往这个文件里面写。

当我们一直往 new.log 中写日志。现在该 rotate 了我们会把 new.log rename 成 new.log.2019-5-14 然后再创建一个 new.log 继续写日志,过程就是这样。

让我们来注意导致多进程问题和报错文件被占用 的最关键的几句话:

doRollover 函数中:

  1. if os.path.exists(dfn):
  2. os.remove(dfn)
  3. # Issue 18940: A file may not have been created if delay is True.
  4. if os.path.exists(self.baseFilename):
  5. os.rename(self.baseFilename, dfn)

根据上面的例子继续来描述。比如现在 dfn 就是 new.log.2019-5-14 。那么我会看有没有存在这个文件,如果有我就会先删除掉,然后再看下 new.log 是否存在,如果存在就执行 rename。

所以问题就很明确了,如果同时有多个进程进入临界区,那么会导致 dfn 文件被删除多次,另外下面的 rename 可能也会产生混乱。

现在我们的思路是: 首先认为文件存在即是已经有人 rename 成功过了,并且在判断文件不存在的时候只允许一个人去 rename ,其他进程如果正好进入临界区就等一等。

怎么做呢,加个锁即可。

又有人问?这个跟作者上一篇文章有什么区别呢?

区别还是有的,上篇文章当日志往日志文件写入时,我们加锁。下个线程写入时,需等待前面的线程完成后才能写入。所以性能比较慢:P

这次我们吸取上面的教训,在文件rename的一瞬间,锁住它,完成分割新的一个文件时,再释放它。它的过程是非常快的。其他线程几乎没有等待,继续向文件写入日志。这样日志不会丢失,性能不受影响。

下面贴个代码(重写doRollover方法):

  1. class MultiCompatibleTimedRotatingFileHandler(TimedRotatingFileHandler):
  2. def doRollover(self):
  3. if self.stream:
  4. self.stream.close()
  5. self.stream = None
  6. # get the time that this sequence started at and make it a TimeTuple
  7. currentTime = int(time.time())
  8. dstNow = time.localtime(currentTime)[-1]
  9. t = self.rolloverAt - self.interval
  10. if self.utc:
  11. timeTuple = time.gmtime(t)
  12. else:
  13. timeTuple = time.localtime(t)
  14. dstThen = timeTuple[-1]
  15. if dstNow != dstThen:
  16. if dstNow:
  17. addend = 3600
  18. else:
  19. addend = -3600
  20. timeTuple = time.localtime(t + addend)
  21. dfn = self.baseFilename + "." + time.strftime(self.suffix, timeTuple)
  22. # 兼容多进程并发 LOG_ROTATE
  23. if not os.path.exists(dfn):
  24. f = open(self.baseFilename, 'a')
  25. fcntl.lockf(f.fileno(), fcntl.LOCK_EX)
  26. if os.path.exists(self.baseFilename):
  27. os.rename(self.baseFilename, dfn)
  28. if self.backupCount > 0:
  29. for s in self.getFilesToDelete():
  30. os.remove(s)
  31. if not self.delay:
  32. self.stream = self._open()
  33. newRolloverAt = self.computeRollover(currentTime)
  34. while newRolloverAt <= currentTime:
  35. newRolloverAt = newRolloverAt + self.interval
  36. # If DST changes and midnight or weekly rollover, adjust for this.
  37. if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:
  38. dstAtRollover = time.localtime(newRolloverAt)[-1]
  39. if dstNow != dstAtRollover:
  40. if not dstNow: # DST kicks in before next rollover, so we need to deduct an hour
  41. addend = -3600
  42. else: # DST bows out before next rollover, so we need to add an hour
  43. addend = 3600
  44. newRolloverAt += addend
  45. self.rolloverAt = newRolloverAt

注意,windows 的python模块是没有fcntl模块的,大家可以自己写一个 文件锁 替换它或者goole找一个。

当然也可以参考我上一篇文章:Python 日志logging模块初探及多线程踩坑(1)_小牛呼噜噜的博客-CSDN博客_logging.filehandler

~~

参考:

https://www.cnblogs.com/gide/p/6811927.html  python中给程序加锁之fcntl模块的使用

http://blog.jobbole.com/104331/  Linux 中 fcntl()、lockf、flock 的区别

用 Python 写一个多进程 多线程 兼容的 TimedRotatingFileHandler - piperck - 博客园

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/黑客灵魂/article/detail/963108
推荐阅读
相关标签
  

闽ICP备14008679号