Reference: https://github.com/thunlp/InfLLM/blob/main/inf_llm/chat.py#L168