Let's Build a Template Engine Together (Part 4)

2016-07-06 00:33:00

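In the previous article our template engine gained support for include and extends, which means it now has most of the features a template engine needs. In this article we address some of the security issues that a template engine used to generate HTML has to deal with.

Escaping

The first issue is escaping. So far our template engine does not escape variables or expression results, so using it to generate HTML source leads to the following problem (template3c.py):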

    >>> from template3c import Template
    >>> t = Template('<h1>{{ title }}</h1>')
    >>> t.render({'title': 'hello<br />world'})
    '<h1>hello<br />world</h1>'

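The tag inside title clearly needs to be escaped, otherwise the output will not be what we expect. Here we only escape the characters & " ' > <; other characters can be handled as needed.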

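    html_escape_table = {
        '&': '&amp;',
        '"': '&quot;',
        '\'': '&apos;',
        '>': '&gt;',
        '<': '&lt;',
    }

    def html_escape(text):
        # Replace each special character with its entity; leave the rest as-is.
        return ''.join(html_escape_table.get(c, c) for c in text)

Escaping in action:

    >>> html_escape('hello<br />world')
    'hello&lt;br /&gt;world'
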
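
The table lookup described above can be compared with the standard library's html.escape, which handles the same five characters. This is only a minimal sketch: the entity names in the table (in particular `&apos;`) are an assumption made for illustration, and html.escape emits `&#x27;` for the single quote instead.

```python
import html

# Assumed entity table for illustration; it covers & " ' > <.
html_escape_table = {
    '&': '&amp;',
    '"': '&quot;',
    "'": '&apos;',
    '>': '&gt;',
    '<': '&lt;',
}

def html_escape(text):
    """Escape one character at a time, so the '&' of an emitted entity is never escaped again."""
    return ''.join(html_escape_table.get(c, c) for c in text)

sample = 'hello<br />world & "friends"'
ours = html_escape(sample)
stdlib = html.escape(sample, quote=True)
print(ours)
print(stdlib)
```

Working character by character avoids the double-escaping trap of chained str.replace calls, where the order of the replacements matters.
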
Where there is escaping there also has to be a way to turn it off; a one-size-fits-all rule would cost us flexibility.

    class NoEscape:

        def __init__(self, raw_text):
            self.raw_text = raw_text

    def escape(text):
        # Values wrapped in NoEscape are returned untouched.
        if isinstance(text, NoEscape):
            return str(text.raw_text)
        else:
            text = str(text)
            return html_escape(text)

    def noescape(text):
        return NoEscape(text)

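
The wrapper idea above can be tried out on its own. In this sketch html.escape from the standard library stands in for the article's html_escape, which is an assumption made only to keep the example self-contained.

```python
import html

class NoEscape:
    """Marks a value that must be emitted without escaping."""

    def __init__(self, raw_text):
        self.raw_text = raw_text

def escape(text):
    # Wrapped values pass through untouched; everything else is escaped.
    if isinstance(text, NoEscape):
        return str(text.raw_text)
    return html.escape(str(text))  # stand-in for html_escape

def noescape(text):
    return NoEscape(text)

escaped = escape('<b>bold</b>')
raw = escape(noescape('<b>bold</b>'))
number = escape(42)  # non-string values are converted with str() first
print(escaped, raw, number)
```

Marking trusted values with a wrapper type keeps escaping as the default, so forgetting to mark something errs on the safe side.
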
The final changes to our template engine for escaping are as follows (template4a.py is available for download):

    class Template:
        def __init__(self, ..., auto_escape=True):
            ...
            self.auto_escape = auto_escape
            self.default_context.setdefault('escape', escape)
            self.default_context.setdefault('noescape', noescape)
            ...

        def _handle_variable(self, token):
            ...  # `variable` is extracted from token, as in the earlier parts
            if self.auto_escape:
                self.buffered.append('escape({})'.format(variable))
            else:
                self.buffered.append('str({})'.format(variable))

        def _parse_another_template_file(self, filename):
            ...
            # Included and parent templates inherit the escaping setting.
            template = self.__class__(
                ...,
                auto_escape=self.auto_escape
            )
            ...

    class NoEscape:
        def __init__(self, raw_text):
            self.raw_text = raw_text

    html_escape_table = {
        '&': '&amp;',
        '"': '&quot;',
        '\'': '&apos;',
        '>': '&gt;',
        '<': '&lt;',
    }

Result:

    >>> from template4a import Template
    >>> t = Template('<h1>{{ title }}</h1>')
    >>> t.render({'title': 'hello<br />world'})
    '<h1>hello&lt;br /&gt;world</h1>'

    >>> t = Template('<h1>{{ noescape(title) }}</h1>')
    >>> t.render({'title': 'hello<br />world'})
    '<h1>hello<br />world</h1>'

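
The effect of the auto_escape switch on the generated code can be illustrated in isolation. The helper compile_expression below is hypothetical and not part of the engine; it only mimics how _handle_variable wraps an expression in escape(...) or str(...), and html.escape stands in for the article's html_escape.

```python
import html

def escape(value):
    return html.escape(str(value))  # stand-in for the engine's escape()

def compile_expression(expr, auto_escape=True):
    """Hypothetical helper: wrap a template expression like _handle_variable does."""
    wrapper = 'escape' if auto_escape else 'str'
    return '{}({})'.format(wrapper, expr)

context = {'title': 'hello<br />world', 'escape': escape}

# The generated source differs only in the wrapping call.
escaped_src = compile_expression('title', auto_escape=True)
plain_src = compile_expression('title', auto_escape=False)

safe = eval(escaped_src, dict(context))
unsafe = eval(plain_src, dict(context))
print(escaped_src, '->', safe)
print(plain_src, '->', unsafe)
```
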
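Security issues with exec

Because our template engine uses the exec function to run the generated code, we need to pay attention to the security problems of exec and guard against possible server-side template injection attacks (for details, see "Some security issues to be aware of when using the exec function").

The first thing to restrict is the use of built-in functions and of the variables of the executing context inside templates (template4b.py):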

    class Template:
        ...

        def render(self, context=None):
            """Render the template."""
            namespace = {}
            namespace.update(self.default_context)
            namespace.setdefault('__builtins__', {})   # disable built-in functions
            ...

Result:

    >>> from template4b import Template
    >>> t = Template('{{ open("/etc/passwd").read() }}')
    >>> t.render()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/mg/develop/lsbate/part4/template4b.py", line 245, in render
        result = namespace[self.func_name]()
      File "<string>", line 3, in __func_name
    NameError: name 'open' is not defined

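
The same behaviour can be reproduced with exec alone, without the template engine. This is a minimal sketch: the source strings are made up for illustration, and the file is never opened because the name lookup fails first.

```python
source = "result = open('/etc/passwd').read()"

# With an empty __builtins__ mapping, names such as open cannot be resolved.
restricted = {'__builtins__': {}}
try:
    exec(source, restricted)
    error = None
except NameError as exc:
    error = str(exc)
print(error)

# When the key is missing, exec() inserts a reference to the real builtins,
# so every built-in function is available to the executed code.
unrestricted = {}
exec("result = len('abc')", unrestricted)
print(unrestricted['result'], '__builtins__' in unrestricted)
```

This is why render sets `__builtins__` explicitly: leaving the key out is not the same as leaving the built-ins out.
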
Next we need to stop built-in functions from being reached by other routes:

    >>> from template4b import Template
    >>> t = Template('{{ escape.__globals__["__builtins__"]["open"]("/etc/passwd").read()[0] }}')
    >>> t.render()
    '#'
    >>> t = Template("{{ [x for x in [].__class__.__base__.__subclasses__() if x.__name__ == '_wrap_close'][0].__init__.__globals__['path'].os.system('date') }}")
    >>> t.render()
    Mon May 30 22:10:46 CST 2016
    '0'

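
Why an empty `__builtins__` is not enough can be seen without touching any file. In this sketch the escape function is a trivial stand-in; the point is only that any function placed in the namespace carries a reference to its defining module's globals, and with it the real built-ins.

```python
def escape(text):
    # Trivial stand-in for the engine's escape(); only its __globals__ matter here.
    return str(text)

namespace = {'__builtins__': {}, 'escape': escape}

# The expression uses only attribute access and subscription, so it needs
# no built-in names and runs fine in the restricted namespace.
leaked = eval("escape.__globals__['__builtins__']", namespace)

# Depending on how this code is run, the value is the builtins module or its dict.
has_open = hasattr(leaked, 'open') or (isinstance(leaked, dict) and 'open' in leaked)
print(has_open)
```
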
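One solution is to forbid access to attributes that start with an underscore in templates. Why include a single underscore as well? Because by convention, attributes that start with a single underscore are private and should not be accessed from outside.

Here we use the dis module to help us analyze the generated code and then pick out the special attributes (latest update: dis cannot analyze the code of nested functions; a safer approach is still being looked for).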

    import dis
    import io
    import re

    class Template:
        def __init__(self, ..., safe_attribute=True):
            ...
            self.safe_attribute = safe_attribute

        def render(self, ...):
            ...
            func = namespace[self.func_name]
            if self.safe_attribute:
                check_unsafe_attributes(func)
            result = func()

    def check_unsafe_attributes(code):
        # Disassemble the generated function and capture the listing as text.
        writer = io.StringIO()
        dis.dis(code, file=writer)
        output = writer.getvalue()

        # Look for a LOAD_ATTR whose attribute name starts with an underscore.
        match = re.search(r'\d+\s+LOAD_ATTR\s+\d+\s+\((?P<attr>_[^\)]+)\)',
                          output)
        if match is not None:
            attr = match.group('attr')
            msg = "access to attribute '{0}' is unsafe.".format(attr)
            raise AttributeError(msg)
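
As an alternative to scanning disassembly text, the check can be sketched with the ast module, which visits nested scopes such as comprehensions as well. This is a different technique from the dis-based one above, shown under the assumption that the expression source is available before code generation; find_unsafe_attributes is a hypothetical name. It only covers direct attribute syntax, so it should not be read as a complete sandbox.

```python
import ast

def find_unsafe_attributes(source):
    """Return the underscore-prefixed attribute names accessed in an expression."""
    tree = ast.parse(source, mode='eval')
    return [node.attr for node in ast.walk(tree)
            if isinstance(node, ast.Attribute) and node.attr.startswith('_')]

# The expression is only parsed here, never evaluated.
attack = ("[x for x in [].__class__.__base__.__subclasses__() "
          "if x.__name__ == '_wrap_close']")
found = find_unsafe_attributes(attack)
harmless = find_unsafe_attributes("title.upper()")
print(sorted(found), harmless)
```
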

Result:

    >>> from template4c import Template
    >>> t = Template("{{ [x for x in [].__class__.__base__.__subclasses__() if x.__name__ == '_wrap_close'][0].__init__.__globals__['path'].os.system('date') }}")
    >>> t.render()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/xxx/lsbate/part4/template4c.py", line 250, in render
        check_unsafe_attributes(func)
      File "/xxx/lsbate/part4/template4c.py", line 296, in check_unsafe_attributes
        raise AttributeError(msg)
    AttributeError: access to attribute '__class__' is unsafe.
    >>> t = Template('<h1>{{ title }}</h1>')
    >>> t.render({'title': 'hello<br />world'})
    '<h1>hello&lt;br /&gt;world</h1>'

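That completes this series of articles.

If you are interested, you can try parsing the template content in a different way, namely with lexical and syntactic analysis (you are welcome to share how you implemented it).

P.S. All the articles in this series:

Let's Build a Template Engine Together (Part 1)
Let's Build a Template Engine Together (Part 2)
Let's Build a Template Engine Together (Part 3)
Let's Build a Template Engine Together (Part 4)

P.S. The code from these articles is on GitHub: https://github.com/mozillazg/lsbate

Keywords: Python, template engine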
