I. Simulating a login on a site that requires an account and password
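I have already tried working with sites that need no login. This time I use Python on a site that does require one, using cookies to simulate the login.
Our academic affairs system uses a CAPTCHA, which makes it somewhat harder, so I picked an easier target: Saikr (賽氪), https://www.
I used the F12 developer tools built into Firefox: open the site, enter the account and password, and log in, as shown in the figure.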

Many POST and GET requests are captured. The first POST request is the one that submits our account and password.

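Clicking the Params tab of the POST request shows that the submitted parameters are in the form data: name is the account name, pass is the encrypted password, and remember says whether to remember the password (0 means do not remember).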
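To make the form data concrete, here is a minimal sketch of how those three fields become the body of a POST request. The field values are hypothetical placeholders, not real credentials, and the field names are the ones seen in the captured request.

```python
from urllib import parse

# Hypothetical placeholder values; the real ones come from the captured request
form_fields = {
    "name": "user@example.com",
    "pass": "encrypted-password",
    "remember": 0,  # 0 = do not remember the password
}

# urlencode() builds the key=value&key=value string; urllib wants it as bytes
body = parse.urlencode(form_fields).encode("utf8")
print(body)

# Decoding the body again recovers the original fields (values come back as strings)
print(parse.parse_qs(body.decode("utf8")))
```

Note that the `@` in the account name is percent-encoded as `%40`, which is what the browser does for an `application/x-www-form-urlencoded` form as well.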
Next, look at the Headers tab, i.e. the request headers.

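We add these request headers to the headers of our own POST request to simulate the login.
The Cookie header is required; without it the server responds with an error:
{"code":403,"message":"访问超时,请重试,多次出现此提示请联系QQ:1409765583","data":[]}
(The message reads: "Access timed out, please retry; if this prompt keeps appearing, contact QQ: 1409765583".)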
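Since this error arrives as a JSON body, a script can test for it instead of relying on reading the printed text. The following is only a sketch: the helper name `is_login_error` is made up for illustration, the sample message is a placeholder, and the `200` code in the second call is an assumption, because the source only shows the failing response.

```python
import json


def is_login_error(body_text: str) -> bool:
    """Return True if the JSON body carries code 403, as in the error quoted above."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return False  # not JSON at all, e.g. an HTML page
    return isinstance(payload, dict) and payload.get("code") == 403


# Placeholder bodies shaped like the error above (message text shortened)
sample_error = '{"code":403,"message":"placeholder","data":[]}'
sample_other = '{"code":200,"data":[]}'  # assumed shape of a non-error reply

print(is_login_error(sample_error))
print(is_login_error(sample_other))
```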
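Next, create an opener that carries a cookie jar. On the first request to the login URL, the cookies set by the login are saved in the jar, and the same opener can then visit other sections of the site to see information that is only visible after logging in.
For example, after logging in at https://www./login I opened the "My Competitions" section at https://www./u/5598522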

The code is as follows:
    from http import cookiejar
    from urllib import parse, request

    login_url = "https://www./login"
    my_url = "https://www./u/5598522"

    # Form fields captured from the login POST request
    postdata = {
        "name": "your account",
        "pass": "your password (encrypted)",
    }
    # Request headers copied from the browser's developer tools.
    # The Cookie header, which the site requires, must also be copied in here.
    header = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
        "Connection": "keep-alive",
        "Referer": "https://www./login",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "TE": "Trailers",
        "X-Requested-With": "XMLHttpRequest",
    }
    # URL-encode the form and convert it to bytes, as urllib requires for POST data
    postdata = parse.urlencode(postdata).encode('utf8')
    # req = requests.post(login_url, data=postdata, headers=header)

    # Create a CookieJar instance to store the cookies
    cookie = cookiejar.CookieJar()
    # Build a cookie handler (CookieHandler) from it with urllib.request's HTTPCookieProcessor
    cookie_support = request.HTTPCookieProcessor(cookie)
    # Build an opener from the cookie handler
    opener = request.build_opener(cookie_support)

    # Create the Request objects
    req1 = request.Request(url=login_url, data=postdata, headers=header)  # POST request (login)
    req2 = request.Request(url=my_url)  # GET request; the opener supplies the cookies saved at login
    response1 = opener.open(req1)
    response2 = opener.open(req2)
    print(response1.read().decode('utf8'))
    print(response2.read().decode('utf8'))
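The listing above needs a real account, so the same cookie flow can also be tried offline. The sketch below is not the site's behavior: it starts a throwaway local server (the `DemoHandler` class, the `/login` and `/me` paths, and the `sessionid=demo123` cookie are all invented for the demonstration) and shows that an opener built on a `CookieJar` sends back the cookie it received at login.

```python
import threading
from http import cookiejar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import parse, request


class DemoHandler(BaseHTTPRequestHandler):
    """Toy site: any POST sets a cookie, any GET checks for it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # the form body is ignored by this toy server
        self.send_response(200)
        self.send_header("Set-Cookie", "sessionid=demo123; Path=/")
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def do_GET(self):
        logged_in = "sessionid=demo123" in self.headers.get("Cookie", "")
        body = b"private page" if logged_in else b"please log in"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


# Port 0 lets the OS pick a free port
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

jar = cookiejar.CookieJar()
# ProxyHandler({}) keeps the local requests away from any configured proxy
opener = request.build_opener(request.ProxyHandler({}),
                              request.HTTPCookieProcessor(jar))

before = opener.open(base + "/me").read().decode("utf8")
postdata = parse.urlencode({"name": "demo", "pass": "demo"}).encode("utf8")
opener.open(request.Request(base + "/login", data=postdata)).read()
after = opener.open(base + "/me").read().decode("utf8")

print(before)  # please log in
print(after)   # private page

server.shutdown()
server.server_close()
```

The second GET succeeds only because the same `opener` is reused; a fresh `request.urlopen()` call would not carry the cookie.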
That wraps up this part.

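P.S. A small hiccup: when the header
Accept-Encoding: gzip, deflate, br
is added to headers, the final print(response1.read().decode('utf8')) fails with
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
Cause: with 'Accept-Encoding': 'gzip, deflate' in the request headers, the server may return a compressed body, and urllib does not decompress it automatically. The byte 0x8b at position 1 is the second byte of the gzip magic number (0x1f 0x8b), so the bytes being decoded are gzip data rather than UTF-8 text.
Reference: https://www.cnblogs.com/chyu/p/4558782.html
Fix: remove Accept-Encoding and everything works normally.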
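If the header has to stay, an alternative to removing it is to decompress the body before decoding. The helper below is a sketch under some assumptions: `decode_body` is a name made up here, the "deflate" branch assumes zlib-wrapped data (some servers send raw deflate instead), and "br" (Brotli) is not handled because the standard library has no decoder for it, so the request should then advertise only gzip and deflate.

```python
import gzip
import zlib


def decode_body(raw: bytes, content_encoding: str = "") -> str:
    """Decompress a response body according to Content-Encoding, then decode it as UTF-8."""
    encoding = content_encoding.lower().strip()
    if encoding == "gzip":
        raw = gzip.decompress(raw)
    elif encoding == "deflate":
        raw = zlib.decompress(raw)  # assumes zlib-wrapped deflate data
    return raw.decode("utf8")


# Simulate a gzip-compressed response body
sample = gzip.compress('{"code":200}'.encode("utf8"))
print(sample[:2].hex())             # 1f8b, the gzip magic number
print(decode_body(sample, "gzip"))  # {"code":200}
```

With urllib, the call would look like `decode_body(response1.read(), response1.headers.get("Content-Encoding", ""))`.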
II. Summary of common methods for simulating a login
1. Making requests with the functions of urllib's request module
    from http import cookiejar
    from urllib import parse, request

    # url, headers (a dict), and data (a dict of form fields) are assumed to be defined already

    # ---- GET request ----
    response = request.urlopen(url)
    page_source = response.read().decode('utf-8')

    # ---- GET request with headers ----
    # urllib.request.urlopen() does not accept a headers argument, so build a
    # urllib.request.Request object to set the request headers
    req = request.Request(url=url, headers=headers)
    response = request.urlopen(req)
    page_source = response.read().decode('utf-8')

    # ---- POST request ----
    postdata = parse.urlencode(data).encode('utf-8')  # the form must be URL-encoded and converted to bytes
    req1 = request.Request(url=url, data=postdata, headers=headers)
    response = request.urlopen(req1)
    page_source = response.read().decode('utf-8')

    # ---- Keeping cookies across requests ----
    # Create a CookieJar instance to store the cookies
    cookie = cookiejar.CookieJar()
    # Build a cookie handler (CookieHandler) from it with urllib.request's HTTPCookieProcessor
    cookie_support = request.HTTPCookieProcessor(cookie)
    # Build an opener from the cookie handler
    opener = request.build_opener(cookie_support)
    # The opener can be installed globally so that urlopen() uses it,
    # or used just where needed through opener.open()
    # request.install_opener(opener)

    # Create the Request object for the page behind the login
    my_url = "https://www./u/5598522"
    req2 = request.Request(url=my_url)
    response1 = opener.open(req1)  # login POST; the cookies are saved in the jar
    response2 = opener.open(req2)  # or directly: response2 = opener.open(my_url)
    print(response1.read().decode('utf8'))
    print(response2.read().decode('utf8'))
2. Using the get and post functions of the requests library
    import json
    from urllib import parse

    import requests

    # ---- GET request with query parameters ----
    base_url = "https://www./"
    params = {'key1': 'value1', 'key2': 'value2'}
    real_url = base_url + "?" + parse.urlencode(params)
    # real_url == "https://www./?key1=value1&key2=value2"
    response = requests.get(real_url)
    # Equivalent: let requests build the query string itself
    response = requests.get(base_url, params=params)
    print(response.text)     # <class 'str'>
    print(response.content)  # <class 'bytes'>

    # ---- POST request (login) ----
    login_url = "https://www./login"
    postdata = {
        "name": "1324802616@qq.com",
        "pass": "my password",
    }
    header = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
        "Connection": "keep-alive",
        "Referer": "https://www./login",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "TE": "Trailers",
        "X-Requested-With": "XMLHttpRequest",
    }
    # requests URL-encodes a dict passed as data, so no manual encoding is needed
    # login_postdata = parse.urlencode(postdata).encode('utf8')
    response = requests.post(url=login_url, data=postdata, headers=header)  # <class 'requests.models.Response'>
    json1 = response.json()                      # <class 'dict'>
    json2 = json.loads(response.text)            # <class 'dict'>
    json_str = response.content.decode('utf-8')  # <class 'str'>

    # ---- Use a session to keep the login while visiting other sections ----
    login_url = "https://www./login"
    data = {
        "name": "1324802616@qq.com",
        "pass": "my password",
    }
    headers = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Connection": "keep-alive",
        "Referer": "https://www./login",
    }
    session = requests.session()
    response = session.post(url=login_url, data=data, headers=headers)
    my_url = "https://www./u/5598522"
    response1 = session.get(url=my_url, headers=headers)