Python crawler learning tutorial: grab Taobao MM photos

Preface

Python is very popular now, thanks to its simple syntax and powerful libraries, and many students want to learn it. This tutorial works through a small crawler as a practical exercise.

The goal of this article

1. Grab the name, avatar and age of each Taobao MM

2. Grab the profile text and photos of each MM

3. Save each MM's photos locally, in a separate folder per MM

4. Become familiar with the process of saving files

1. URL format

The URL we use here is http://mm.taobao.com/json/request_top_list.htm?page=1. The part before the question mark is the base address, and the page parameter after it is the page number, which you can change freely. Opening the URL shows brief introductions of Taobao MMs, each with a hyperlink to her personal details page.
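As a small illustration of the URL format, the sketch below builds the list-page address for a given page number. It is written for Python 3, and the names BASE_URL and build_page_url are made up for this example; only the address itself and the page parameter come from the text above.

```python
from urllib.parse import urlencode

# Base address taken from the article; everything after "?" is the query string.
BASE_URL = "http://mm.taobao.com/json/request_top_list.htm"


def build_page_url(page):
    """Return the list-page URL for the given page number (hypothetical helper)."""
    return BASE_URL + "?" + urlencode({"page": page})


# Print the addresses of the first three pages.
for page in range(1, 4):
    print(build_page_url(page))
```

Using urlencode instead of string concatenation keeps the query string valid if more parameters are added later.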

From this page we need to grab each MM's avatar address, name, age, place of residence, and personal details page address.

2. Grab brief information

After several hands-on exercises, you should already be familiar with crawling a page and extracting addresses from it, so there is nothing difficult here. We first grab each MM's details page address, name, age and other information from this page and print it out. The code is as follows
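As a rough illustration of this step, the Python 3 sketch below pulls the details page address, avatar address, name, age and place of residence out of an HTML fragment with one regular expression. The real page's markup is not shown in this article, so SAMPLE_HTML, the pattern, and the names ITEM_PATTERN and parse_brief are all assumptions; the pattern would need to be adapted to the actual page source.

```python
import re

# Hypothetical markup standing in for the real list page (an assumption).
SAMPLE_HTML = """
<div class="list-item">
  <a href="http://example.com/detail/1"><img src="http://example.com/avatar/1.jpg"></a>
  <a class="lady-name" href="http://example.com/detail/1">Alice</a>
  <strong>25</strong>
  <span>Hangzhou</span>
</div>
<div class="list-item">
  <a href="http://example.com/detail/2"><img src="http://example.com/avatar/2.jpg"></a>
  <a class="lady-name" href="http://example.com/detail/2">Betty</a>
  <strong>23</strong>
  <span>Shanghai</span>
</div>
"""

# One named group per field; re.S lets ".*?" skip across line breaks.
ITEM_PATTERN = re.compile(
    r'<div class="list-item">.*?'
    r'<a href="(?P<detail_url>[^"]+)"><img src="(?P<avatar_url>[^"]+)">.*?'
    r'<a class="lady-name"[^>]*>(?P<name>[^<]+)</a>.*?'
    r'<strong>(?P<age>\d+)</strong>.*?'
    r'<span>(?P<city>[^<]+)</span>',
    re.S,
)


def parse_brief(html):
    """Return one dict of string fields per list item found in html."""
    return [match.groupdict() for match in ITEM_PATTERN.finditer(html)]


for item in parse_brief(SAMPLE_HTML):
    print(item["name"], item["age"], item["city"], item["detail_url"])
```

Named groups keep each field readable at the call site, at the cost of a longer pattern.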

The results of the operation are as follows

3. Introduction to file writing

Here we need methods for writing pictures and writing text, plus one for creating a new directory

1) Write pictures

# Takes a picture address and a file name, and saves a single picture
# (Python 2; requires "import urllib" at the top of the file)
def saveImg(self, imageURL, fileName):
    u = urllib.urlopen(imageURL)
    data = u.read()
    f = open(fileName, 'wb')
    f.write(data)
    f.close()
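The method above is Python 2 code. In Python 3, urllib.urlopen moved to urllib.request.urlopen, so a rough equivalent might look like the sketch below. The function name save_img is made up for this example, and it is written as a plain function rather than a method.

```python
import urllib.request


def save_img(image_url, file_name):
    """Download one picture and write its raw bytes to file_name."""
    # The with-blocks close the response and the file even if an error occurs.
    with urllib.request.urlopen(image_url) as response:
        data = response.read()
    with open(file_name, "wb") as f:
        f.write(data)
```

The file is opened in binary mode ("wb") because picture data is bytes, not text.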

2) Write text

# Takes the profile text and the MM's name, and saves the text to <name>/<name>.txt
def saveBrief(self, content, name):
    fileName = name + "/" + name + ".txt"
    f = open(fileName, "w+")
    print u"Quietly saving her personal information to", fileName
    f.write(content.encode('utf-8'))
    f.close()
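For comparison, a Python 3 sketch of the same idea is shown below. The name save_brief is made up for this example; like the method above, it assumes that a directory named after the MM already exists.

```python
import os


def save_brief(content, name):
    """Write the profile text to <name>/<name>.txt as UTF-8 and return the path."""
    file_name = os.path.join(name, name + ".txt")
    # In Python 3 the encoding is given to open(), so content stays a str.
    with open(file_name, "w", encoding="utf-8") as f:
        f.write(content)
    return file_name
```

Passing encoding="utf-8" explicitly avoids depending on the platform's default encoding when the profile contains Chinese text.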

3) Create a new directory
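One possible way to create the directory is sketched below in Python 3. The function name mkdir and its True/False return value are choices made for this example, not something prescribed by the article.

```python
import os


def mkdir(path):
    """Create the directory if it does not exist; return True if it was created."""
    path = path.strip()
    if os.path.exists(path):
        # Nothing to do: the folder (or a file with that name) is already there.
        return False
    os.makedirs(path)
    return True
```

os.makedirs also creates any missing parent directories, which is convenient when each MM gets her own folder under a common root.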

4. Complete code

The main points have been covered in the previous sections. If you have read the earlier chapters, completing this crawler should not be a problem. The details will not be repeated here, and the code is posted directly.

The two files above make up all the code. Run it and try it out.

Then see what has changed in the folder

Reference: Python crawler learning tutorial: Grab Taobao MM photos, Tencent Cloud Developer Community, https://cloud.tencent.com/developer/article/1460500