Python crawler learning tutorial: grab Taobao MM photos

Goals of this article

1. Grab the name, profile picture and age of each Taobao MM

2. Grab each MM's profile and photos

3. Save each MM's photos locally, in a separate folder per MM

4. Get familiar with the process of saving files

1. URL format

The URL we use here has two parts: the base address comes before the question mark, and the page parameter after it gives the page number. You can change the page number freely. Opening the address shows short introductions of Taobao MMs, each with a hyperlink to a personal details page.

From this page we need to grab each MM's avatar address, name, age, place of residence, and personal details page address.
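The base-address-plus-page-parameter format described above can be sketched as follows. This is only an illustration in Python 3: `BASE_URL` is a hypothetical placeholder and the helper name `page_url` is an assumption, neither taken from the article.

```python
from urllib.parse import urlencode

# Hypothetical base address, used only for illustration.
BASE_URL = "http://example.com/mm/list.htm"


def page_url(page, base_url=BASE_URL):
    """Return the listing URL for the given page number."""
    # Everything before "?" is the base address; "page" is the only parameter.
    return base_url + "?" + urlencode({"page": page})


# Build the addresses of the first three listing pages.
for page in range(1, 4):
    print(page_url(page))
```

Keeping the base address separate from the page number makes it easy to loop over a range of pages later.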

2. Grab brief information

After several hands-on exercises, you should already be familiar with crawling a page and extracting addresses from it, so nothing here is difficult. We first grab the details page address, name, age and other information for each MM on this page and print them out. The code is as follows

Running it gives the following results
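As a rough sketch of the extraction step, a regular expression can pull the details page address, avatar address, name, age and place of residence out of the listing HTML. The markup in `SAMPLE_HTML`, the pattern, and the name `parse_list` are all assumptions made for illustration (Python 3); the real page layout may differ and the pattern would need adapting.

```python
import re

# Hypothetical markup for two list entries; not the real page source.
SAMPLE_HTML = """
<div class="list-item">
  <a class="avatar" href="http://example.com/detail/1"><img src="http://example.com/avatar/1.jpg"></a>
  <a class="name" href="http://example.com/detail/1">Alice</a>
  <strong>25</strong>
  <span>Hangzhou</span>
</div>
<div class="list-item">
  <a class="avatar" href="http://example.com/detail/2"><img src="http://example.com/avatar/2.jpg"></a>
  <a class="name" href="http://example.com/detail/2">Bella</a>
  <strong>23</strong>
  <span>Shanghai</span>
</div>
"""

# One capture group per field; re.S lets "." match across line breaks.
PATTERN = re.compile(
    r'<div class="list-item">.*?'
    r'<a class="avatar" href="(.*?)">\s*<img src="(.*?)".*?'
    r'<a class="name".*?>(.*?)</a>.*?'
    r'<strong>(.*?)</strong>.*?'
    r'<span>(.*?)</span>',
    re.S,
)


def parse_list(html):
    """Return one dict per list entry found in the HTML."""
    fields = ("detail_url", "avatar_url", "name", "age", "city")
    return [dict(zip(fields, match.groups())) for match in PATTERN.finditer(html)]


for entry in parse_list(SAMPLE_HTML):
    print(entry["detail_url"], entry["name"], entry["age"], entry["city"])
```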

3. Introduction to file writing

Here we need two methods: one for writing pictures and one for writing text

1) Write pictures


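import urllib

# Pass in the picture address and file name, and save a single picture
def saveImg(self, imageURL, fileName):
    u = urllib.urlopen(imageURL)
    # Read the raw bytes of the picture
    data = u.read()
    # Open in binary mode, because a picture is not text
    f = open(fileName, 'wb')
    f.write(data)
    f.close()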

2) Write text


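# Pass in the profile text and the MM's name, and save the text as <name>/<name>.txt
def saveBrief(self, content, name):
    fileName = name + "/" + name + ".txt"
    f = open(fileName, "w+")
    print u"Quietly saving her personal information as", fileName
    # Encode the unicode text as UTF-8 before writing
    f.write(content.encode('utf-8'))
    f.close()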
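For readers on Python 3, the two helpers above might look roughly like this. It is a sketch rather than the article's code: the snake_case names, the plain functions without `self`, and the `with` blocks are assumptions made for illustration.

```python
import os
from urllib.request import urlopen


def save_img(image_url, file_name):
    """Download image_url and write its raw bytes to file_name."""
    with urlopen(image_url) as response:
        data = response.read()
    # Binary mode: picture data must be written unchanged.
    with open(file_name, "wb") as f:
        f.write(data)


def save_brief(content, name):
    """Write the profile text to <name>/<name>.txt and return that path.

    The folder <name> must already exist.
    """
    file_name = os.path.join(name, name + ".txt")
    # Text mode with an explicit encoding replaces the manual encode() call.
    with open(file_name, "w", encoding="utf-8") as f:
        f.write(content)
    return file_name
```

The `with` blocks close each file even if writing fails, which is why no explicit `close()` call appears.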

3) Create a new directory
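A minimal sketch of this step in Python 3, assuming a helper named `mkdir` that creates the folder only when it does not exist yet. The name and the True/False return convention are assumptions, not taken from the article.

```python
import os


def mkdir(path):
    """Create the directory if it is missing.

    Returns True if the directory was created, False if it already existed.
    """
    # Strip stray whitespace that may come along with a scraped name.
    path = path.strip()
    if os.path.exists(path):
        return False
    # makedirs also creates any missing parent directories.
    os.makedirs(path)
    return True
```

Calling it once per MM before saving means the text and picture files always have a folder to land in.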

4. Completing the code

The main knowledge points have been covered above. If you have read the previous chapters, completing this crawler should not be a problem. The details are not repeated here, and the code is posted directly.

The two files above make up all of the code. Run them and try it out.

Then see what has changed in the folder
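One way to check the result is to list every file the crawler wrote. The sketch below is in Python 3, and the helper name `list_tree` is an assumption; it walks a folder and returns the relative paths of all files under it.

```python
import os


def list_tree(root):
    """Return the sorted relative paths of all files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, root))
    return sorted(found)


if __name__ == "__main__":
    # List everything below the current working directory.
    for path in list_tree("."):
        print(path)
```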

