最近最少使用的缓存机制，完整实现-技术圈

你好，我是zhenguo

今天结合一道leetcode有意思的题目，设计和实现一个 LRU (最近最少使用) 缓存机制，顺便和读者们加强下双向链表、字典这些数据结构的应用能力。链表增删操作时间复杂度都是O(1)，这是它最强的地方，尤其追求卓越性能的算法场景，应用广泛。同时，在面试中也经常会考察到。但是，链表比较容易出错，如果在项目中应用，务必要多多测试。

1 问题

运用你所掌握的数据结构，设计和实现一个 LRU (最近最少使用) 缓存机制。实现 LRUCache 类：

LRUCache(int capacity) 以正整数作为容量 capacity 初始化 LRU 缓存

int get(int key) 如果关键字 key 存在于缓存中，则返回关键字的值，否则返回 -1 。

void put(int key, int value) 如果关键字已经存在，则变更其数据值；如果关键字不存在，则插入该组「关键字-值」。

当缓存容量达到上限时，它应该在写入新数据之前删除最久未使用的数据值，从而为新的数据值留出空间。

你是否可以在 O(1) 时间复杂度内完成这两种操作？

链接：https://leetcode-cn.com/problems/lru-cache

2 链表

这道题最高效的O(1)实现需要基于链表结构，因此再温习一下链表的基本操作。

链表是一种节点传递结构，所谓的"传递"是靠next变量，以此建立指向下一个节点的关系，可以理解为一条边，仅此而已。

如果想令j节点指向i节点，需要如何做？如果对链表不熟悉，可能想当然的这么操作：

node_j.next = node_i

上面操作后实现效果如下：

这是不对的，节点i指向节点j的边还存在，这不是我们想要的结果！因此，我们需要摘除这条边，那么如何摘除？实际这才是链表的灵活之处，所谓的摘除只不过是一个None赋值操作：

node_i.next = None

上面赋值实现的效果如下：

你看到吗？剪断后，节点i和节点j之间不再有链接关系。所以j节点指向i节点的完整代码如下：

node_i.next = None
node_j.next = node_i

3 实现思路

下面我们再回头问题，要想在O(1)时间复杂度内求解，需要借助字典和双向链表，问题涉及的主要操作包括：

(1). put操作
加入键值对分两种情况讨论：

(1).1 键不存在
(1).2 键存在

(2).get操作，get操作除了具备dict.get的功能外，此处需要定制一个新的功能，即最近使用的节点需要移动到热点区域。而我们知道链表的增删时间复杂度都是O(1)，所以根据这个定制化需求，使用链表是再自然不过的了！

4 完整代码

class Node(object):
    """
    定义双向链表
    """
    def __init__(self, val, pre, next):
        self.val = val
        self.pre = pre
        self.next = next

LRUCache缓存类：

class LRUCache:
    def __init__(self, capacity: int):
        self.cache_dict = {}
        self.linked_dict = {}
        self.capacity = capacity
        self.head, self.tail = None, None

put操作对应代码，主要解决三种情况，这和题目是完整一一对应的，分别为：

如果关键字已经存在，则变更其数据值
如果关键字不存在，则插入该组「关键字-值」
当缓存容量达到上限时，它应该在写入新数据之前删除最久未使用的数据值，从而为新的数据值留出空间。


    def put(self, key: int, value: int) -> None:
        if key in self.cache_dict:  # 如果关键字已经存在，则变更其数据值
            self.cache_dict[key] = value
            self._move_to_tail(key)
        elif len(self.cache_dict) < self.capacity:  # 如果关键字不存在，则插入该组「关键字-值」
            self.cache_dict[key] = value
            self.linked_dict[key] = Node(key, None, None)
            if not self.head:
                self.head = self.linked_dict[key]
            if self.tail:
                self.tail.next = self.linked_dict[key]

            tmp = self.tail
            self.tail = self.linked_dict[key]
            if tmp:
                self.linked_dict[key].pre = tmp
                tmp.next = self.linked_dict[key]

        else:  # 当缓存容量达到上限时，
            # 它应该在写入新数据之前删除最久未使用的数据值，从而为新的数据值留出空间。
            self.cache_dict[key] = value
            self.linked_dict[key] = Node(key, None, None)
            del_key = self.head.val
            self.cache_dict.pop(del_key)
            pre, next = self.linked_dict[del_key].pre, self.linked_dict[del_key].next
            if not pre and not next:  # 删除节点无头无尾
                self.head = self.linked_dict[key]
            elif not pre:  # 删除元素是头节点
                next.pre = None
                self.head = next
            elif not next:  # 删除元素是尾节点
                pre.next = None
                self.tail = pre
            else:
                pre.next = next
                next.pre = pre

            # 然后在尾部连接上key节点
            self.tail.next = self.linked_dict[key]
            self.linked_dict[key].pre = self.tail
            self.linked_dict[key].next = None
            self.tail = self.linked_dict[key]
            # 最后再从linked_dict中移除key
            self.linked_dict.pop(del_key)

get操作，

    def get(self, key: int) -> int:
        if key in self.linked_dict and self.tail != self.linked_dict[key]:
            self._move_to_tail(key)
        return self.cache_dict.get(key, -1)

简单重构一下，抽离出一个移动节点i到tail处的方法_move_to_tail：

def _move_to_tail(self, key):
        """
        移动key节点到tail
        :param key:
        :return:
        """
        pre, next = self.linked_dict[key].pre, self.linked_dict[key].next
        # 从链表中隔断key节点
        if pre and next:
            pre.next = next
            next.pre = pre
        elif next:
            next.pre = None
            self.head = next
        # 如果key节点不在尾部，则移动key节点到尾部
        if self.tail != self.linked_dict[key]:
            self.linked_dict[key].pre = self.tail
            tmp = self.tail
            self.tail = self.linked_dict[key]
            if tmp:
                tmp.next = self.tail
            self.linked_dict[key].next = None

5 后记

链表操作比较容易出错，平时需要多加练习，才能在实际项目和面试中，使用得心应手。

其实链表的应用极为广泛，包括我们熟知的版本管理软件git，里面的多分支和某个分支的管理，都是基于链表操作的。牢固的掌握链表才算是深度掌握算法和数据结构的第一步。