Implementing automatic login and daily sign-in with a Python crawler (with code)

This article explains in detail, with pictures and code, how to implement automatic login and daily sign-in with a Python crawler. It may serve as a reference for your study or work; readers who need it can refer to it.

A few days ago I was downloading things from a material website, and my points were never enough. Where do points come from? You earn them by logging in to the site and signing in every day. You can of course buy them, but I didn't want to, since I only use the site once in a while, and each time I did, the points turned out to be insufficient; and I could never remember to sign in every day. Stuck with this dilemma, I wrote a small crawler in Python that automatically signs in every day to earn the points. Enough talk, let's look at the code.

I am using Python 3.4 here. If you need Python 2.x, please look for other articles instead.

Tool: Fiddler

First download and install Fiddler. This tool is used to monitor network requests and helps you analyze request links and parameters.

Open the target website: http://www.17sucai.com/, then click login

Okay, but don't rush to log in. Open Fiddler first; at this point there are no network requests in it. Now go back to the page, enter your email and password, click login, and then switch to Fiddler to see what was captured.

The first request there is the network request fired when you clicked login. Click this entry to see your request details on the right.

Then click the WebForms tab to see the request parameters, i.e. the username and password.

Below is the code that implements the login function:

import urllib.request
import urllib.parse
import gzip
import http.cookiejar

# Build an opener that carries the given headers and handles cookies
def getOpener(head):
    # deal with the cookies
    cj = http.cookiejar.CookieJar()
    pro = urllib.request.HTTPCookieProcessor(cj)
    opener = urllib.request.build_opener(pro)
    header = []
    for key, value in head.items():
        elem = (key, value)
        header.append(elem)
    opener.addheaders = header
    return opener

# Decompress the returned data if the server gzipped it
def ungzip(data):
    try:
        print('Decompressing...')
        data = gzip.decompress(data)
        print('Decompression complete!')
    except OSError:
        print('Not compressed, no need to decompress')
    return data

# Encapsulate header information, disguised as a browser
header = {
    'Connection': 'Keep-Alive',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
    'Accept-Encoding': 'gzip, deflate',
    'X-Requested-With': 'XMLHttpRequest',
    'Host': 'www.17sucai.com',
}

url = 'http://www.17sucai.com/auth'
opener = getOpener(header)

id = 'xxxxxxxxxxxxx'   # your username
password = 'xxxxxxx'   # your password
postDict = {
    'email': id,
    'password': password,
}
postData = urllib.parse.urlencode(postDict).encode()
op = opener.open(url, postData)
data = op.read()
data = ungzip(data)
print(data)
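As an aside (my addition, not part of the original article): the same login can be written more compactly with the third-party requests library (pip install requests), which handles cookies and gzip decompression for you. A minimal sketch reusing the URL and form fields from the code above:

import requests

# A Session keeps cookies across requests, like the cookie-aware opener above
session = requests.Session()
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64)',
    'X-Requested-With': 'XMLHttpRequest',
})
resp = session.post('http://www.17sucai.com/auth',
                    data={'email': 'xxxxxxx', 'password': 'xxxxxxx'})
print(resp.status_code, resp.text)  # requests decodes gzip automatically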

Okay, now clear Fiddler, run this program, and take another look at Fiddler.

You can click on this link to check whether the request information on the right matches what your browser sent earlier.

The following is the information printed after the program runs.

code=200 indicates a successful login.

After logging in, the next thing you need is the sign-in URL. For this you need an account that has not yet signed in today: click the sign-in button on the website and use Fiddler to capture the sign-in link and the information it requires.

Then click "check in"; once the check-in succeeds, look at the captured URL in Fiddler.

Click this URL to view, on the right, the header information and cookies needed to access the link. We can simply reuse the cookies obtained from the successful login, and Python's http.cookiejar module already encapsulates cookie handling nicely. This is the part of my code that handles cookies:

cj = http.cookiejar.CookieJar()
pro = urllib.request.HTTPCookieProcessor(cj)
opener = urllib.request.build_opener(pro)
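Note that a plain CookieJar lives only in memory. If you would rather persist the login cookies between runs, http.cookiejar also offers MozillaCookieJar, which can save to and load from a file. A small sketch of this variation (my addition, not from the original article; the file name is a hypothetical placeholder):

import http.cookiejar
import urllib.request

cookie_file = 'cookies.txt'  # hypothetical path
cj = http.cookiejar.MozillaCookieJar(cookie_file)
try:
    cj.load(ignore_discard=True)   # reuse cookies from an earlier run, if any
except FileNotFoundError:
    pass                           # first run: no cookie file yet
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
# ... log in with opener as above, then persist the session cookies:
cj.save(ignore_discard=True)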

The following is the information returned on a successful check-in: code=200 means the request succeeded, day=1 means one day of consecutive check-ins, and score=20 is the number of points earned.
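Since the server answers with JSON, you can also parse the response instead of eyeballing the printed bytes. A minimal sketch (my addition), assuming the field names shown above:

import json

# data is the decompressed response body (bytes) from the code above
result = json.loads(data.decode('utf-8'))
if result.get('code') == 200:
    print('Signed in: day', result.get('day'), '- points earned:', result.get('score'))
else:
    print('Sign-in failed:', result)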

The complete code is below. Of course, to test the sign-in part you will again need an account that has not signed in yet.

import urllib.request
import urllib.parse
import gzip
import http.cookiejar

# Build an opener that carries the given headers and handles cookies
def getOpener(head):
    # deal with the cookies
    cj = http.cookiejar.CookieJar()
    pro = urllib.request.HTTPCookieProcessor(cj)
    opener = urllib.request.build_opener(pro)
    header = []
    for key, value in head.items():
        elem = (key, value)
        header.append(elem)
    opener.addheaders = header
    return opener

# Decompress the returned data if the server gzipped it
def ungzip(data):
    try:
        print('Decompressing...')
        data = gzip.decompress(data)
        print('Decompression complete!')
    except OSError:
        print('Not compressed, no need to decompress')
    return data

header = {
    'Connection': 'Keep-Alive',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
    'Accept-Encoding': 'gzip, deflate',
    'X-Requested-With': 'XMLHttpRequest',
    'Host': 'www.17sucai.com',
}

# Step 1: log in
url = 'http://www.17sucai.com/auth'
opener = getOpener(header)

id = 'xxxxxxx'         # your username
password = 'xxxxxxx'   # your password
postDict = {
    'email': id,
    'password': password,
}
postData = urllib.parse.urlencode(postDict).encode()
op = opener.open(url, postData)
data = op.read()
data = ungzip(data)
print(data)

# Step 2: sign in, reusing the login cookies kept by the opener
url = 'http://www.17sucai.com/member/signin'  # sign-in address
op = opener.open(url)
data = op.read()
data = ungzip(data)
print(data)

Compared with logging in, signing in is simply a matter of opening one more link once you are logged in. Since my account has already signed in today, I won't post a screenshot of the program running here.

The next thing to do is to write a bat script on your computer and then add a timed task in the Windows "Task Scheduler"; a minimal sketch of the bat file follows.
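A sketch of such a bat file (my addition; the interpreter and script paths are hypothetical placeholders, adjust them to your machine):

@echo off
rem Run the sign-in crawler once; Task Scheduler triggers this daily
C:\Python34\python.exe D:\scripts\signin.py

Point a daily trigger in Task Scheduler at this .bat file and the script signs in for you every day.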

Before that, you also need to have the Python environment variables configured; I won't go into the details here.

That's all for this article on implementing automatic login and sign-in with a Python crawler.
