Jianshu Unofficial Big Data: New Ideas


The crawler team's ultimate task is a big data analysis of Jianshu. I have done this once before, and that post got a decent number of reads. Jianshu recently closed a round of financing and has changed in some ways since then, so this is a good opportunity for another analysis.

Topic URL

This part has not changed. Because Jianshu has no URL that lists users directly, we can only start from the topic URLs, which are still the popular topics and the city topics.

Topic administrator URL

This part is the new idea. Previously, I crawled the authors of each topic's articles and then crawled those authors' followers, which gave me the full set of users to crawl. This time, the topic administrators are crawled as the first-level users. The administrator list is loaded asynchronously, and the asynchronous request URLs for the home page and for the other topics are different (you will see this when you inspect the network requests).
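Collecting an asynchronously loaded administrator list can be sketched as follows. This is a minimal sketch under assumptions that are not in the original post: that the list is served in numbered pages and that an empty page marks the end. `fetch_page` is a hypothetical stand-in for the real HTTP request, whose URL has to be found by inspecting the network requests; the fake data below only makes the sketch runnable.

```python
from typing import Callable, Dict, List


def fetch_all_pages(fetch_page: Callable[[int], List[str]],
                    max_pages: int = 1000) -> List[str]:
    """Collect items page by page until an empty page is returned.

    fetch_page is a hypothetical callable: given a 1-based page number,
    it returns the administrator IDs found on that page.
    """
    items: List[str] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:  # assumption: an empty page means there is no more data
            break
        items.extend(batch)
    return items


# Fake paged data standing in for the asynchronous responses.
fake_pages: Dict[int, List[str]] = {1: ["admin_a", "admin_b"], 2: ["admin_c"]}
admins = fetch_all_pages(lambda page: fake_pages.get(page, []))
```

Passing the fetcher in as a parameter keeps the paging logic separate from the request details, so the home page and the other topics can each supply their own fetcher.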

Followers and following URL

We can think of it this way. An administrator generally has many followers, and most of them are ordinary onlookers like us. An administrator's peers, on the other hand, tend to appear among the users the administrator follows. Crawling in both directions (followers and following) therefore reaches most users, although some users still cannot be reached.
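The two-way crawl described above can be sketched as a breadth-first traversal that starts from the administrators. This is a minimal sketch, not the author's actual crawler: `get_followers` and `get_following` are hypothetical callables that stand in for the real page requests, and the small in-memory graph is made-up illustration data.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Optional, Set


def crawl_users(seeds: Iterable[str],
                get_followers: Callable[[str], Iterable[str]],
                get_following: Callable[[str], Iterable[str]],
                max_users: Optional[int] = None) -> Set[str]:
    """Breadth-first crawl over both follower and following links."""
    queue = deque(dict.fromkeys(seeds))  # de-duplicate seeds, keep their order
    seen: Set[str] = set(queue)
    while queue:
        user = queue.popleft()
        # Look in both directions: who follows this user, and whom they follow.
        neighbours = list(get_followers(user)) + list(get_following(user))
        for other in neighbours:
            if other in seen:
                continue  # each user is recorded only once
            if max_users is not None and len(seen) >= max_users:
                return seen
            seen.add(other)
            queue.append(other)
    return seen


# Made-up graph: "loner" has no link to anyone, so the crawl cannot reach them.
followers: Dict[str, List[str]] = {"admin": ["fan1", "fan2"]}
following: Dict[str, List[str]] = {"admin": ["peer"], "fan1": ["admin"], "loner": []}

users = crawl_users(["admin"],
                    lambda u: followers.get(u, []),
                    lambda u: following.get(u, []))
```

The `seen` set is what keeps repeated data low, and the unreachable `loner` account shows why the resulting user set may be incomplete.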

Advantages and disadvantages

This method is much faster than crawling articles, and it produces far less duplicate data (when crawling articles, a user who has posted several articles shows up several times). The disadvantage is that the data may be incomplete.

Reference: https://cloud.tencent.com/developer/article/1155584 (Cloud+ Community, Tencent Cloud)