Python crawler: crawl high-definition pictures you like


1. Grab different types of pictures

2. Write a crawler program with a GUI and package it into an exe

3. Difficulties encountered

1. Analyze how to capture different types of pictures

First, open the website. You can see 6 menu categories.


Click the different menus and you will find that the URL changes as follows:

Big breasted girl: https:/cid = 2

Little butt: https:/cid = 6

You can see that each type of picture corresponds to a different cid value

So to grab a particular type of picture, you only need to construct the URL with the matching cid.

In other words, turn the cid into a parameter and pass it into the URL.

The specific code is defined below
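As a minimal sketch of this idea, the function below builds a category URL from a cid. The base URL is a hypothetical placeholder, since the real address is truncated in this article, and the function name is made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real base URL is truncated in the article.
BASE_URL = "https://example.com/list"


def build_category_url(cid, base_url=BASE_URL):
    """Return the listing URL for one picture category, identified by its cid."""
    return f"{base_url}?{urlencode({'cid': cid})}"


print(build_category_url(2))
print(build_category_url(6))
```

Using urlencode instead of plain string concatenation keeps the query string valid if more parameters are added later.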

2. Use tkinter for GUI programming

I have written some posts on tkinter programming before

For example, using python to make a translation tool

Let’s take a look at the final page layout of the program designed this time.

Then I will talk about how to achieve it in detail. The page layout is as follows:

Select image storage path

The captured pictures need to be saved locally, so it is best to let the user choose any local folder as the storage path.

After searching online, I found that tkinter supports this.

It is done with the askdirectory() method in the tkinter.filedialog module

Below is a piece of sample code found on the Internet
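A minimal sketch of calling askdirectory() could look like the following. The function name pick_folder and the optional ask parameter are assumptions added for illustration (the parameter lets the dialog be swapped out when no display is available); only askdirectory() itself comes from tkinter.

```python
def pick_folder(ask=None):
    """Open a folder-selection dialog and return the chosen path.

    Returns an empty string if the user cancels the dialog.
    """
    if ask is None:
        # Imported lazily so this module also loads on machines without a display.
        from tkinter.filedialog import askdirectory
        ask = askdirectory
    chosen = ask(title="Select image storage path")
    # Normalise any empty result (cancelled dialog) to "".
    return chosen or ""
```

Called with no arguments inside a running tkinter program, pick_folder() shows the system folder chooser.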

For this example, there are three steps:

(1) Define a text box to store (display) the selected storage path

(2) Set a button to trigger the function of selecting the local path

(3) Define a function to realize the path selection function

When saving pictures later, the storage path can be read directly from self.input defined earlier
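The three steps above can be sketched roughly as follows. This is not the original program: the class name PathSelector, the grid layout, and the button text are assumptions, and self.input is assumed here to be a StringVar bound to the text box.

```python
class PathSelector:
    """A text box plus a button for choosing where pictures are saved."""

    def __init__(self, parent):
        import tkinter as tk  # lazy import: only needed when the GUI is built

        # (1) Text box that displays the selected storage path.
        self.input = tk.StringVar(parent)
        tk.Entry(parent, textvariable=self.input, width=40).grid(row=0, column=0)
        # (2) Button that triggers the path selection.
        tk.Button(parent, text="Select path",
                  command=self.select_path).grid(row=0, column=1)

    def select_path(self, ask=None):
        """(3) Ask for a folder and show it in the text box."""
        if ask is None:
            from tkinter.filedialog import askdirectory
            ask = askdirectory
        chosen = ask()
        if chosen:  # keep the previous value if the dialog was cancelled
            self.input.set(chosen)
        return self.input.get()
```

To try it, create a tkinter.Tk() root window, pass it to PathSelector, and call the root's mainloop(). The download code can then call self.input.get() to obtain the folder.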

Choose category

Because the pictures are divided into 6 categories and each category corresponds to a cid value, the cid can be abstracted in advance and passed as a parameter.

(1) Define a drop-down box to hold the picture categories

(2) Depending on the selected category, return the corresponding cid value
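One possible way to sketch these two steps is a dictionary that maps category names to cid values, combined with a ttk.Combobox. The category labels below are hypothetical placeholders, and only the cid values 2 and 6 are taken from this article; the remaining categories would need to be filled in from the site.

```python
# Hypothetical labels; only the cid values 2 and 6 appear in the article.
CATEGORY_CIDS = {
    "Category A": 2,
    "Category B": 6,
}


def cid_for(category):
    """(2) Return the cid that belongs to the selected category name."""
    try:
        return CATEGORY_CIDS[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None


def build_category_box(parent):
    """(1) Create a read-only drop-down box listing the categories."""
    from tkinter import ttk  # lazy import: only needed when the GUI is built

    box = ttk.Combobox(parent, values=list(CATEGORY_CIDS), state="readonly")
    box.current(0)  # preselect the first category
    return box
```

In the GUI, cid_for(box.get()) would then give the cid for whatever the user selected.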

Fill in the number of crawled pages


The crawl depth is customizable: for example, crawl only the first 5 pages or the first 10 pages.

The value entered in this text box is then passed into the URL
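A rough sketch of this step is shown below: read the text from the box, convert it to a page count, and build one URL per page. The query parameter name page and the example base URL are assumptions, because the article does not show how the site numbers its pages.

```python
from urllib.parse import urlencode


def parse_page_count(text, default=1):
    """Convert the text-box content into a positive number of pages."""
    text = text.strip()
    if not text:
        return default  # nothing typed: fall back to a single page
    pages = int(text)  # raises ValueError for non-numeric input
    if pages < 1:
        raise ValueError("page count must be at least 1")
    return pages


def page_urls(base_url, cid, pages):
    """Build the listing URLs for pages 1..pages of one category.

    The 'page' parameter name is a guess, not taken from the site.
    """
    return [
        f"{base_url}?{urlencode({'cid': cid, 'page': n})}"
        for n in range(1, pages + 1)
    ]
```

For example, page_urls("https://example.com/list", 2, parse_page_count("5")) yields five URLs for the category with cid 2.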

3. Problems encountered

Some downloaded images have invalid names and cannot be saved

Some pictures have no name, so the file name becomes just .jpg. When saving, this triggers an illegal-character error, and the program crashes and stops.

To solve this problem, I append a letter to the end of every file name, so that no picture is left without a name
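The fix can be sketched as below. Appending a letter follows the approach described above; stripping characters that Windows rejects in file names is an extra precaution added here, not something the original program is known to do. The function name and the default letter are assumptions.

```python
import re

# Common characters that Windows does not allow in file names.
_ILLEGAL = re.compile(r'[\\/:*?"<>|]')


def safe_filename(title, suffix="a", ext=".jpg"):
    """Build a file name that is never empty, even for an untitled picture."""
    cleaned = _ILLEGAL.sub("", title).strip()
    # The appended letter guarantees at least one character before the extension.
    return f"{cleaned}{suffix}{ext}"
```

With this, an untitled picture is saved as a.jpg instead of .jpg, and a title such as sun/set? becomes sunseta.jpg.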

The overall effect is as follows:


Reference: https://cloud.tencent.com/developer/article/1536531 python crawler: crawl your favorite high-definition pictures-Cloud + Community-Tencent Cloud