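Douyin is one of the more popular apps right now, and sometimes you need to scrape videos on a specific topic. This post scrapes videos about comics (漫画); to scrape something else, just replace the keyword.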
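As a minimal sketch of swapping the keyword, the search URL can be assembled from parameters instead of edited by hand. The helper name `build_search_url` and its defaults are illustrative assumptions, not part of the original spider. Only the endpoint and the `keyword`, `offset`, `count`, and `source` parameters are taken from the URL used in the spider in this post; a real request would presumably still need the device parameters from that URL.

```python
from urllib.parse import urlencode

# Endpoint taken from the spider in this post.
SEARCH_ENDPOINT = "https://aweme.snssdk.com/aweme/v1/search/item/"


def build_search_url(keyword, offset=0, count=10):
    """Return a search URL for `keyword` (hypothetical helper)."""
    params = {
        "keyword": keyword,
        "offset": offset,
        "count": count,
        "source": "video_search",
    }
    # urlencode percent-encodes non-ASCII keywords such as 漫画 as UTF-8.
    return SEARCH_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("漫画", offset=10))
```

Changing the topic then only means passing a different string as `keyword`.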

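One more note: please use your own request headers. To make testing easier I have included mine, so please don't wear out my account... O(∩_∩)O haha~ [This also guards against the device getting banned and becoming unusable once other people start using it.]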
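One way to keep your own headers out of the spider source is to load them from a local file. This is only a sketch under assumptions: the function name `load_headers` and the idea of storing the headers as a flat JSON object are mine, not something the original code does.

```python
import json


def load_headers(path):
    """Read request headers from a JSON file holding one flat object (assumed layout)."""
    with open(path, encoding="utf-8") as fh:
        headers = json.load(fh)
    if not isinstance(headers, dict):
        raise ValueError("expected a JSON object mapping header names to values")
    # Header values should be strings; coerce in case a number was written unquoted.
    return {str(name): str(value) for name, value in headers.items()}
```

In a spider, the class attribute could then be set with something like `headers = load_headers("my_headers.json")`, where the file name is hypothetical.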

 

Straight to the code: <https://github.com/hilqiqi0/crawler/tree/master/simple/douyin>. It uses the Scrapy framework because it is quick and convenient.
    # -*- coding: utf-8 -*-
    import json

    import scrapy
    from scrapy import Request


    class douyinSpider(scrapy.Spider):
        name = 'douyin'
        chinese_name = "抖音"
        allowed_domains = ['aweme.snssdk.com']

        # Headers for the search request; replace them with your own.
        headers = {
            "Host": "aweme.snssdk.com",
            "Connection": "keep-alive",
            "Cookie": (
                "install_id=57005283726; "
                "ttreq=1$4a0dbe45de94b2c261ba4470caadb235daaeba62; "
                "odin_tt=4a4e27b35ee49fea4b87c0bf97cc6045521eb2cb1a6eba2bd57a992e322ce1fe144690bbd3f5b070d4e64fdb4022a3b3; "
                "sid_guard=7f0b613c53151e8f663d5667aa2ab4a8%7C1546931191%7C5184000%7CSat%2C+09-Mar-2019+07%3A06%3A31+GMT; "
                "uid_tt=ba8a9bf01162a3fe45ecee17a222ac70; "
                "sid_tt=7f0b613c53151e8f663d5667aa2ab4a8; "
                "sessionid=7f0b613c53151e8f663d5667aa2ab4a8"
            ),
            "Accept-Encoding": "gzip",
            "X-SS-REQ-TICKET": "1546931874666",
            "X-Tt-Token": "007f0b613c53151e8f663d5667aa2ab4a83581edb01c9fc144a35eb8deee9efdf03906d2746df2a2fc862d2c264ae23dfd36",
            "sdk-version": "1",
            "X-SS-TC": "0",
            "User-Agent": "com.ss.android.ugc.aweme/400 (Linux; U; Android 4.4.2; zh_CN; OPPO R11; Build/NMF26X; Cronet/58.0.2991.0)",
            "X-Gorgon": "01815c6f0a7f59ff5db3810469ab03ed9145722c3a62c70571",
            "X-Khronos": "1546931874",
            "X-Pods": "a1bf8bdca715069f27f9ab3662c19ccec595b790",
            "Content-Length": "0"
        }

        # Headers for the video request; "Vpwp-Raw-Key" is filled in per video in pares().
        headers_video = {
            "Range": "bytes=0-163840",
            "Vpwp-Type": "preloader",
            # "Vpwp-Raw-Key": "v0200f840000bf5svk8ckqbibu1vt8jg_h264_540p",
            "Vpwp-Flag": "0",
            "Accept-Encoding": "identity",
            "Host": "aweme.snssdk.com",
            # "Connection": "Keep-Alive",
            "User-Agent": "okhttp/3.10.0.1"
        }

        def start_requests(self):
            # url = "https://aweme.snssdk.com/aweme/v1/search/item/?keyword=漫画&offset=10&count=10&source=video_search&is_pull_refresh=1&hot_search=0&ts=1546931874&js_sdk_version=&app_type=normal&openudid=8cec4b81deae6417&version_name=4.0.0&device_type=OPPO R11&ssmix=a&iid=57005283726&os_api=19&mcc_mnc=46007&device_id=59343989226&resolution=720*1280&device_brand=OPPO &aid=1128&manifest_version_code=400&app_name=aweme&_rticket=1546931874668&os_version=4.4.2&device_platform=android&version_code=400&update_version_code=4002&ac=wifi&dpi=240&uuid=863064010113316&language=zh&channel=aweGW&as=a1c594f3425aaceeb44477&cp=4fa6c7592a483fe4e1skao&mas=01801923065ea090cf1bff7e05117a2187ecec2c2c2c46a6a6c686"
            # Same search URL without the trailing as/cp/mas parameters.
            url = "https://aweme.snssdk.com/aweme/v1/search/item/?keyword=漫画&offset=10&count=10&source=video_search&is_pull_refresh=1&hot_search=0&ts=1546931874&js_sdk_version=&app_type=normal&openudid=8cec4b81deae6417&version_name=4.0.0&device_type=OPPO R11&ssmix=a&iid=57005283726&os_api=19&mcc_mnc=46007&device_id=59343989226&resolution=720*1280&device_brand=OPPO &aid=1128&manifest_version_code=400&app_name=aweme&_rticket=1546931874668&os_version=4.4.2&device_platform=android&version_code=400&update_version_code=4002&ac=wifi&dpi=240&uuid=863064010113316&language=zh&channel=aweGW"
            yield Request(url, callback=self.pares, headers=self.headers)

        def pares(self, response):
            # print(response.body)
            infos = json.loads(response.body)
            for info in infos["aweme_list"]:
                url = info["video"]["play_addr"]["url_list"][0]
                url_key = info["video"]["play_addr"]["url_key"]
                self.headers_video["Vpwp-Raw-Key"] = url_key
                # Do not follow the 302; pass it to the callback so its body can be printed.
                yield Request(url, callback=self.pares_video,
                              headers=self.headers_video,
                              meta={
                                  'dont_redirect': True,
                                  'handle_httpstatus_list': [302]
                              })

        def pares_video(self, response):
            # print() with parentheses works on both Python 2 and Python 3.
            print(response.body)
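The parsing step can also be tried without sending any request. The sketch below applies the same lookups as `pares` to an already-downloaded response body. The function name `extract_videos` and the sample data are made up for illustration; only the `aweme_list`, `video`, `play_addr`, `url_list`, and `url_key` field names come from the spider.

```python
import json


def extract_videos(body):
    """Return (url, url_key) pairs from a search response body (bytes or str)."""
    infos = json.loads(body)
    videos = []
    # "or []" covers both a missing "aweme_list" key and a null value.
    for info in infos.get("aweme_list") or []:
        play_addr = info["video"]["play_addr"]
        videos.append((play_addr["url_list"][0], play_addr["url_key"]))
    return videos


if __name__ == "__main__":
    # Made-up sample shaped like the fields the spider reads.
    sample = json.dumps({
        "aweme_list": [
            {"video": {"play_addr": {
                "url_list": ["https://example.com/play/1"],
                "url_key": "example_key_1",
            }}},
        ]
    })
    print(extract_videos(sample))
```

Keeping this logic in a plain function makes it easy to check against a saved response before running the whole crawl.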
Running the code and its output (note: the links obtained are time-limited; when opened again the next day they had already expired):
[root@hilqiqi0 crawler]# scrapy crawl douyin
<a
href="http://v6-dy.ixigua.com/d973b2c1e561b08be7cf000ca5c599a9/5c349f01/video/m/220f9431ffedac9426e8e5518b8d4de7f50116129f5a0000833343be2f12/?rc=M2tvOHJ1bWdkajMzZGkzM0ApQHRAbzc7OjkzNjczNDQ8PDs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QG1lMTRqYzJnZl8tLWEtL3NzLW8jbyMvMDYwNS0uLS0yMi4tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v6-dy.ixigua.com/05f1ae2bea2e6395b37c06f9326178f5/5c349ed9/video/m/220c9fb66f341ab4524935c68cb0481ca671157a96e0000869d96125f6c/?rc=M28zbDZ0Nmo6ZjMzOmkzM0ApQHRAbzpEOjUzOTszNDc8PDs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QDRnZXEuMWktNl8tLWMtL3NzLW8jbyM2Ni41Li0uLS0yMi4tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v6-dy.ixigua.com/1fbd744336289c983d326d91571670b3/5c349eda/video/m/2200d71100341084c71a6577908788995cf115a83c4000005946e363fff/?rc=M2Q4aXU7PDY3ZzMzPGkzM0ApQHRAbzlENTszNzszNDg8PDs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QHEybzM1M2k2bl8tLS4tL3NzLW8jbyM0MDM2My0uLS0yMi4tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v9-dy-y.ixigua.com/959de723c3872bc5a733663a28e4f8e2/5c349edf/video/m/22079c23ec7573b4190b81b9c0a5dc79fe9115fadc00000685eab6d47ae/?rc=anczN3RvamdtaTMzOGkzM0ApQHRAb0RJPDM0NjczNDo8PDs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QDRrci0tZWRxbl8tLWEtL3NzLW8jbyNDNjAyNi0uLS0yMi4tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v1-dy.ixigua.com/c319ba7385f67ab91fb31ed047469131/5c349eef/video/m/2208a36559b95e940cf907789bc7e2bb4cb116102664000088c394fcb706/?rc=am86c3U8am45ajMzaWkzM0ApQHRAbzxHPDo0ODszNDs8PDs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QG5tLy1jYS9jMF8tLTYtL3NzLW8jbyNDNDQxNi0uLS0yMi4tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v1-dy.ixigua.com/7f5c4958a6f40819dec2d542c32b3926/5c349ee7/video/m/220b495e66a22f94e00b67fb157ba6b40231161020150000b05ccea21564/?rc=MzlxaWZ1PHk8aTMzPGkzM0ApQHRAb0Y1Mzo1OTszNDMzMzs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QGlxYy40aS01cV8tLTQtL3NzLW8jbyNAMDEtLS0uLS0tLS8tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v3-dy-z.ixigua.com/0df88da3bed274b622375059a6ee92bc/5c349f02/video/m/2200273ad7094364387a78abdc58a34c4d411610e63e0000ac6e571959ff/?rc=Mzo2bmV0NnJ1ajMzNmkzM0ApQHRAb0dEMzs0OTczNDUzMzs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QGM2NGcuMXFrNl8tLWMtL3NzLW8jbyNALjQ2Li0uLS0tLS8tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v3-dy-z.ixigua.com/cafdf619d12fae4371b2ff26b0f763a6/5c349ee4/video/m/220257576c3706843ceb7f847ca8bd42b9311610a18c00008d03f427fff3/?rc=M2hzZ2xmc2ZrajMzZWkzM0ApQHRAb0kzNjozNzkzNDgzMzs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QG0yMmovZzY0LV8tLS0tL3NzLW8jbyMvMi4xMC0uLS0tLS8tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v6-dy.ixigua.com/697db0ead8441fe317f22afeff7bd5bf/5c349ee1/video/m/220e24254b1d04340b09d636f7afcde71c2116102b29000066dc9a3edd32/?rc=MzU0c207OTVoaTMzN2kzM0ApQHRAb0dJOzc0OTMzNDkzMzs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QHNjNGVvZWRecF8tLTItL3NzLW8jbyNCNDIwMy0uLS0tLS8tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
<a
href="http://v9-dy.ixigua.com/162cb2096fdc655ce23d997d47af5404/5c349ef4/video/m/22009802ae674e246ca9a9d7031cef61d14115ec975000036566fc15ed9/?rc=anNrOGl1PDk1aTMzOGkzM0ApQHRAbzY5NTkzNjczNDwzMzs0PDNAKXUpQGczdylAZmxkamV6aGhkZjs0QGdkMTM0aTUzZF8tLTQtL3NzLW8jbyNANi0uNS0uLS0tLS8tLS4vaTpiLW8jOmAtbyNtbCtiK2p0OiMvLl4%3D">Found</a>.
You can copy a link straight from the output above and open it in a browser.
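Instead of copying links by hand, the `href` values can be pulled out of each printed body. The sketch below assumes every body is a small HTML fragment with an `<a href="...">Found</a>` anchor, as in the output above. The names `HrefCollector` and `extract_links` are illustrative and not part of the original spider.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(body):
    """Return all anchor targets found in a response body (bytes or str)."""
    if isinstance(body, bytes):
        body = body.decode("utf-8", errors="replace")
    collector = HrefCollector()
    collector.feed(body)
    collector.close()
    return collector.hrefs
```

Inside `pares_video`, `extract_links(response.body)` could replace the plain `print` so that only the video URL is emitted.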

