继续挑战,拓展一下
第26题地址decent.html
 
- 网页标题是be a man - apologize!,题目内容为Hurry up, I'm missing the boat,源码里面有一行隐藏内容:<!– you’ve got his e-mail –> 
图片是两只猴子,看不出啥。
结合标题说apologize道歉和提示收到了邮件可以联想到前面第23题bonus.html有一段TODO是要给Leopold发邮件道歉。
首先编写一个config.ini配置文件,内容如下(相应修改成自己的):
[email]
smtp = smtp.gmail.com
addr = ********@gmail.com
pswd = ********
然后使用configparser模块读取配置,并使用email模块生成邮件,使用smtplib模块进行发送:
from configparser import ConfigParser
from email.message import EmailMessage
from smtplib import SMTP_SSL
config = ConfigParser()
config.read('config.ini')
email_config = config['email']
msg = EmailMessage()
msg['From'] = email_config['addr']
msg['To'] = 'leopold.moz@pythonchallenge.com'
msg['Subject'] = 'Apology'
msg.set_payload('Sorry!')
with SMTP_SSL(email_config['smtp'], 465) as smtp:
    smtp.set_debuglevel(2)
    smtp.login(email_config['addr'], email_config['pswd'])
    smtp.send_message(msg)
直到提示
reply: retcode (250); Msg: b’2.0.0 OK …’
就说明发送成功了。
去邮箱里面收信:
From: “Leopold Mozart” leopold.moz@pythonchallenge.com
Subject: Re: my broken zip Re: Apology
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bitNever mind that.
Have you found my broken zip?
md5: bbb8b499a0eef99b52c7f13f4e78c24b
Can you believe what one mistake can lead to?
Leopold说没关系。但是要你去找broken zip,并提供了一个md5,说有一个错误需要你去修复它。
当然还有一件事我们没忘,就是第24题走迷宫解出来的压缩包里面有一个叫mybroken.zip的文件,估计就是那个,我们来看看。
from collections import deque
from io import BytesIO
from zipfile import ZipFile
import numpy as np
import requests
from PIL import Image
with requests.Session() as sess:
    sess.auth = ('butter', 'fly')
    response = sess.get('http://www.pythonchallenge.com/pc/hex/maze.png').content
    img = Image.open(BytesIO(response))
width, height = img.size
img_data = img.load()
visit = np.zeros((width, height))
neighbors = ((-1, 0), (0, -1), (0, 1), (1, 0))
start, end = (width - 2, 0), (1, height - 1)
path = deque()
path.append(start)
visit[start[0], start[1]] = True
while path and path[-1] != end:
    cx, cy = path[-1]
    for dx, dy in neighbors:
        x, y = cx + dx, cy + dy
        if all((0 <= x < width, 0 <= y < height, not visit[x, y], img_data[x, y][2] == 0)):
            path.append((x, y))
            visit[x, y] = True
            break
    else:
        path.pop()
maze = bytes(img_data[x, y][0] for x, y in path)
with ZipFile(BytesIO(maze[1::2]), 'r') as f:
    with f.open('mybroken.zip', 'r') as f_zip:
        mybroken = f_zip.read()
print(mybroken[:20])
b'PK\x03\x04\x14\x00\x00\x00\x08\x00EC\xb22\xe7\x8278\xf1\t'
确实是一个以PK\x03\x04开头的PKZip文件。我们确认一下是否损坏:
from io import BytesIO
from zipfile import ZipFile
with ZipFile(BytesIO(mybroken), 'r') as f:
    print(f.filelist)
[<ZipInfo filename='mybroken.gif' compress_type=deflate filemode='-rw-r--r--' file_size=2803 compress_size=2545>]
里面是一张名为mybroken.gif的图片,把图片打开看看:
from io import BytesIO
from zipfile import ZipFile
from PIL import Image
with ZipFile(BytesIO(mybroken), 'r') as f:
    with f.open('mybroken.gif', 'r') as f_gif:
        try:
            gif = Image.open(f_gif).copy()
        except Exception as e:
            print(f'{e!r}')
BadZipFile("Bad CRC-32 for file 'mybroken.gif'")
文件确实损坏了。
再回看邮件的内容,Leopold提供给了我们一个md5,并且说有一个错误,应该是指只有一个字节出错了。
MD5消息摘要算法(英语:MD5 Message-Digest Algorithm),一种被广泛使用的密码散列函数,可以产生出一个128位(16字节)的散列值(hash value),用于确保信息传输完整一致。MD5由美国密码学家罗纳德·李维斯特(Ronald Linn Rivest)设计,于1992年公开,用以取代MD4算法。这套算法的程序在 RFC 1321 中被加以规范。
将数据(如一段文字)运算变为另一固定长度值,是散列算法的基础原理。
1996年后被证实存在弱点,可以被加以破解,对于需要高度安全性的数据,专家一般建议改用其他算法,如SHA-2。2004年,证实MD5算法无法防止碰撞(collision),因此不适用于安全性认证,如SSL公开密钥认证或是数字签名等用途。From wikipedia.org
我们穷举好了:
from hashlib import md5
def try_fix(data: bytes, md5sum: str) -> bytes:
    for i, b in enumerate(data):
        for fix in range(256):
            if fix == b:
                continue
            repair = data[:i] + bytes([fix]) + data[i + 1:]
            if md5(repair).hexdigest() == md5sum:
                print(f'Index {i}: {bytes([b])} -> {bytes([fix])}')
                return repair
fix = try_fix(mybroken, 'bbb8b499a0eef99b52c7f13f4e78c24b')
Index 1234: b'-' -> b'\xa8'
再次打开图片:
from io import BytesIO
from zipfile import ZipFile
from PIL import Image
with ZipFile(BytesIO(fix), 'r') as f:
    with f.open('mybroken.gif', 'r') as f_gif:
        gif = Image.open(f_gif).copy()
gif

好了,地址改为speed.html,不存在。
再加上题目内容说missing the boat,miss有错过和漏掉的意思,我们把boat加上,speedboat.html,来到了下一题。
有点牵强,就这样吧。
 
      