StevenPZChan
StevenPZChan
8 min read

Categories

Tags

继续挑战


第20题地址idiot2.html

  • unreal.jpg
  • 网页标题是go away!,题目内容为but inspecting it carefully is allowed.,源码里面没有隐藏内容

图片是一面栅栏,上面写着PRIVATE PROPERTY BEYOND THIS FENCE栅栏后面有私有财产,结合题目内容说但允许仔细观察,估计是图片后面暗藏玄机了。
观察到图片名字是unreal.jpg,意识到问题并不简单。
尝试了一下把图片改成real.jpg发现图片不存在。
最后还是把焦点放在这张unreal.jpg上:

import requests

with requests.Session() as sess:
    sess.auth = ('butter', 'fly')
    response = sess.get('http://www.pythonchallenge.com/pc/hex/unreal.jpg')
    print(response.headers)
{'Content-Type': 'image/jpeg', 'Content-Range': 'bytes 0-30202/2123456789', 'Transfer-Encoding': 'chunked', 'Date': 'Sat, 01 Feb 2020 07:06:41 GMT', 'Server': 'lighttpd/1.4.35'}

咦?看到了些异样。Content-Rangebytes 0-30202/2123456789,总共有好大,但是只返回了前面的一部分,正好是这张图片。
这样一回想题目的内容,说明图片后面的私有财产正是这请求后面的部分:

with requests.Session() as sess:
    sess.auth = ('butter', 'fly')
    header = {'Range': 'bytes=30203-'}
    response = sess.get('http://www.pythonchallenge.com/pc/hex/unreal.jpg', headers=header)
    print(response.text)
    print(response.headers)
Why don't you respect my privacy?

{'Content-Type': 'application/octet-stream', 'Content-Transfer-Encoding': 'binary', 'Content-Range': 'bytes 30203-30236/2123456789', 'Transfer-Encoding': 'chunked', 'Date': 'Sat, 01 Feb 2020 07:06:42 GMT', 'Server': 'lighttpd/1.4.35'}

又玩这种循环迭代的游戏了!那我们继续:

import re
import requests

with requests.Session() as sess:
    sess.auth = ('butter', 'fly')
    header = {}
    for i in range(400):
        response = sess.get('http://www.pythonchallenge.com/pc/hex/unreal.jpg', headers=header)
        response_header = response.headers
        if not response_header['Content-Type'].startswith('image'):
            print(response.text)
        if 'Content-Range' not in response_header:
            break
        
        content_range = response_header['Content-Range']
        print(content_range)
        result = re.match(r'bytes (?P<first>\d+)-(?P<last>\d+)/(?P<total>\d+)', content_range)
        header['Range'] = f'bytes={int(result["last"]) + 1}-'
print(response.headers)
bytes 0-30202/2123456789
Why don't you respect my privacy?

bytes 30203-30236/2123456789
we can go on in this way for really long time.

bytes 30237-30283/2123456789
stop this!

bytes 30284-30294/2123456789
invader! invader!

bytes 30295-30312/2123456789
ok, invader. you are inside now. 

bytes 30313-30346/2123456789

{'Content-type': 'text/html; charset=UTF-8', 'Content-Length': '0', 'Date': 'Sat, 01 Feb 2020 07:06:45 GMT', 'Server': 'lighttpd/1.4.35'}

欲拒还迎的。。。说我们成功invade入侵了。打开invader.html

Yes! that’s you!

Emmm。。。就说了句你是invader入侵者。

想了想后面还有这么多的Content,肯定有有用的信息。从头获取是不让了,不妨从最深处开始?

with requests.Session() as sess:
    sess.auth = ('butter', 'fly')
    header = {'Range': f'bytes={result["total"]}-'}
    for i in range(400):
        response = sess.get('http://www.pythonchallenge.com/pc/hex/unreal.jpg', headers=header)
        response_header = response.headers
        if not response_header['Content-Type'].startswith('image'):
            print(response.text)
        if 'Content-Range' not in response_header:
            break
        
        content_range = response_header['Content-Range']
        print(content_range)
        result = re.match(r'bytes (?P<first>\d+)-(?P<last>\d+)/(?P<total>\d+)', content_range)
        header['Range'] = f'bytes={int(result["first"]) - 1}-'
print(response.headers)
esrever ni emankcin wen ruoy si drowssap eht

bytes 2123456744-2123456788/2123456789
and it is hiding at 1152983631.

bytes 2123456712-2123456743/2123456789

{'Content-type': 'text/html; charset=UTF-8', 'Content-Length': '0', 'Date': 'Sat, 01 Feb 2020 07:06:46 GMT', 'Server': 'lighttpd/1.4.35'}

果然有东西!!

看上去是一段倒序的内容:

print('esrever ni emankcin wen ruoy si drowssap eht'[::-1])
the password is your new nickname in reverse

密码是你的新昵称反过来写。新昵称是啥?应该是invader,反过来是redavni,不知道有什么用,总之先记下来。
我们看后面一段,说有东西藏在了1152983631,不多说,直接获取:

with requests.Session() as sess:
    sess.auth = ('butter', 'fly')
    header = {'Range': 'bytes=1152983631-'}
    response = sess.get('http://www.pythonchallenge.com/pc/hex/unreal.jpg', headers=header)
print(response.headers)
print(response.content[:20])
{'Content-Type': 'application/octet-stream', 'Content-Transfer-Encoding': 'binary', 'Content-Range': 'bytes 1152983631-1153223363/2123456789', 'Transfer-Encoding': 'chunked', 'Date': 'Sat, 01 Feb 2020 07:06:47 GMT', 'Server': 'lighttpd/1.4.35'}
b'PK\x03\x04\x14\x00\t\x00\x08\x00;\xa7\xaa2\xac\xe5f\x14\xa9\x00'

PK\x03\x04开头的是PKZip格式:

ZIP文件格式是一种数据压缩和文档储存的文件格式,原名Deflate,发明者为菲尔·卡茨(Phil Katz),他于1989年1月公布了该格式的资料。
使用任何一种文本编辑器打开Zip文件,都能看到前两个字母为:PK

From wikipedia.org

我们打开来看看:

from io import BytesIO
from zipfile import ZipFile

with ZipFile(BytesIO(response.content), 'r') as f:
    print(f.filelist)
    try:
        with f.open('readme.txt', 'r') as f_readme:
            read_me = f_readme.read()
            print(read_me.decode())
    except Exception as e:
        print(f'{e!r}')
[<ZipInfo filename='readme.txt' compress_type=deflate filemode='-rw-r--r--' file_size=211 compress_size=169>, <ZipInfo filename='package.pack' compress_type=deflate filemode='-rw-r--r--' file_size=239194 compress_size=239246>]
RuntimeError("File 'readme.txt' is encrypted, password required for extraction")

说文件被加密了。想到刚才得到个密码还没用呢:

from io import BytesIO
from zipfile import ZipFile

with ZipFile(BytesIO(response.content), 'r') as f:
    print(f.filelist)
    with f.open('readme.txt', 'r', pwd=b'invader'[::-1]) as f_readme:
        read_me = f_readme.read()
print(read_me.decode())
[<ZipInfo filename='readme.txt' compress_type=deflate filemode='-rw-r--r--' file_size=211 compress_size=169>, <ZipInfo filename='package.pack' compress_type=deflate filemode='-rw-r--r--' file_size=239194 compress_size=239246>]
Yes! This is really level 21 in here. 
And yes, After you solve it, you'll be in level 22!

Now for the level:

* We used to play this game when we were kids
* When I had no idea what to do, I looked backwards.

压缩包readme说我们已经来到了下一题,估计题目就是这个压缩包里面的内容了,解完就是再下一题!

总结:这一题用到了http分段请求的一些知识,更多的是一些正则表达式的操作,巩固了zipfile库的使用。

本题代码地址20_idiot2.ipynb